Hello, I downloaded your RL dataset csfufu/mmrl, and it seems that the images in rows with dataset=None do not match their questions. (Some rows with other dataset labels also seem to have mismatched images.)
This is the code I used to check the data:
from io import BytesIO
import base64

import pyarrow.parquet as pq
import tqdm
from PIL import Image

def get_base64_image(image_byte):
    # Encode the raw image bytes as a base64 string
    base64_str = base64.b64encode(image_byte).decode()
    return base64_str

file = "train1.parquet"
table = pq.read_table(file)
df = table.to_pandas()
for index, row in tqdm.tqdm(df.iterrows()):
    if index == 24592:
        text = row["problem"]
        images = row["images"]
        dataset = row["dataset"]
        print(text)
        # Round-trip through base64, then open the first image with PIL
        image_content = get_base64_image(images[0]["bytes"])
        image_pil = Image.open(BytesIO(base64.b64decode(image_content)))
        break
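In case it helps with reproducing this, here is a minimal sketch that writes the question text and the first image of a few chosen rows to disk so they can be compared side by side. The helper names (`guess_extension`, `dump_rows`, `load_rows`) and the `mismatch_check` output folder are hypothetical, and it assumes the indices are positional row numbers and that the `problem`, `images` and `dataset` columns look the way they do in the script above.

```python
from pathlib import Path

# Magic-byte prefixes for a few common formats; anything else is saved as .bin.
_MAGIC = {
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
    b"GIF8": ".gif",
}


def guess_extension(image_bytes):
    """Guess a file extension from the leading bytes of an image."""
    for magic, ext in _MAGIC.items():
        if image_bytes.startswith(magic):
            return ext
    return ".bin"


def dump_rows(rows, indices, out_dir="mismatch_check"):
    """Write the first image and the question text of each selected row.

    rows maps a row index to a dict with "problem", "images" and "dataset"
    keys (assumed column names, taken from the script above).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i in indices:
        row = rows[i]
        image_bytes = row["images"][0]["bytes"]
        image_path = out / f"{i}{guess_extension(image_bytes)}"
        image_path.write_bytes(image_bytes)
        text = f"dataset={row['dataset']}\n{row['problem']}\n"
        (out / f"{i}.txt").write_text(text, encoding="utf-8")
        written.append(image_path)
    return written


def load_rows(parquet_path, indices):
    """Read only the requested rows (by position) from a parquet file."""
    # Imported lazily so the helpers above also work without pyarrow.
    import pyarrow.parquet as pq

    table = pq.read_table(parquet_path)
    return {i: table.slice(i, 1).to_pylist()[0] for i in indices}
```

With the file from the script above, something like `dump_rows(load_rows("train1.parquet", [24592, 26309, 19708]), [24592, 26309, 19708])` should leave one image file and one `.txt` file per index in the output folder.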
For example, for question 24592, the question is:
As shown in the figure, in rhombus $ABCD$, $\angle A=60^\circ$, $E$ is the midpoint of $AD$, and $F$ is the midpoint of $AB$. If $EF=\sqrt{5}$, what is the perimeter of rhombus $ABCD$?
and the image is
Another example, for question 26309:
As shown in the figure, quadrilateral $ABCD$ is inscribed in $\odot O$, and point $M$ is on the extension of $AD$. If $\angle AOC = 142^\circ$, what is the degree measure of $\angle CDM$?
and the image is
Could you please check this problem?
Update:
Rows with other dataset labels may also be wrong, for example the row with index 19708:
Given a circle with two secants as shown at the right. Find the measure of the arc designated by x.![]()
Its image:
