{"id":26360,"date":"2024-03-22T00:27:34","date_gmt":"2024-03-21T23:27:34","guid":{"rendered":"https:\/\/www.graviton.at\/letterswaplibrary\/80-million-tiny-images-dataset-image-decoding-problem\/"},"modified":"2024-03-22T00:27:34","modified_gmt":"2024-03-21T23:27:34","slug":"80-million-tiny-images-dataset-image-decoding-problem","status":"publish","type":"post","link":"https:\/\/www.graviton.at\/letterswaplibrary\/80-million-tiny-images-dataset-image-decoding-problem\/","title":{"rendered":"80 Million Tiny Images Dataset Image Decoding Problem"},"content":{"rendered":"<p><!-- SC_OFF --><\/p>\n<div class=\"md\">\n<p>I can&#8217;t get the dataset to display correctly. I&#8217;ve tried converting the MATLAB script into a Python script, but this is the result:<\/p>\n<p><a href=\"https:\/\/drive.google.com\/file\/d\/1kzA7mNC4th8nbJh4iGoaZJB_xV4HO7r_\/view?usp=sharing\">https:\/\/drive.google.com\/file\/d\/1kzA7mNC4th8nbJh4iGoaZJB_xV4HO7r_\/view?usp=sharing<\/a><\/p>\n<p>And this is the adapted script:<\/p>\n<pre><code>import numpy as np\nimport os\nimport matplotlib.pyplot as plt\n\ndef load_tiny_images(ndx, filename=None):\n    if filename is None:\n        filename = 'Z:\/Tiny_Images_Dataset\/data\/tiny_images.bin'\n        # filename = 'C:\/atb\/Databases\/Tiny Images\/tiny_images.bin'\n\n    sx = 32  # side size\n    Nimages = len(ndx)\n    nbytes_per_image = sx * sx * 3\n    img = np.zeros((sx * sx * 3, Nimages), dtype=np.uint8)\n    pointer = (np.array(ndx) - 1) * nbytes_per_image\n    # read data\n    with open(filename, 'rb') as f:\n        for i in range(Nimages):\n            f.seek(pointer[i])  # moves the pointer to the beginning of the image\n            img[:, i] = np.frombuffer(f.read(nbytes_per_image), dtype=np.uint8)\n    img = img.reshape((sx, sx, 3, Nimages))\n    return img\n\ndef show_images(images):\n    N = images.shape[3]\n    fig, axes = plt.subplots(1, N, figsize=(N, 1))\n    if N == 1:\n        axes = [axes]\n    for i, ax in enumerate(axes):\n        ax.imshow(images[:, :, :, i])\n        ax.axis('off')\n    plt.show()\n\n# load the first 10\/79302017 imgs\nimg = load_tiny_images(list(range(1, 11)))\nshow_images(img)\n<\/code><\/pre>\n<p>What am I missing? Has anyone been able to open it correctly with Python?<\/p>\n<p>Just for completeness, this is the original MATLAB code (I&#8217;m a total beginner in MATLAB):<\/p>\n<pre><code>function img = loadTinyImages(ndx, filename)\n%\n% Random access into the file of tiny images.\n%\n% It goes faster if ndx is a sorted list\n%\n% Input:\n%   ndx = vector of indices\n%   filename = full path and filename\n% Output:\n%   img = tiny images [32x32x3xlength(ndx)]\n\nif nargin == 1\n    filename = 'Z:\\Tiny_Images_Dataset\\data\\tiny_images.bin';\n    % filename = 'C:\\atb\\Databases\\Tiny Images\\tiny_images.bin';\nend\n\n% Images\nsx = 32;\nNimages = length(ndx);\nnbytesPerImage = sx*sx*3;\nimg = zeros([sx*sx*3 Nimages], 'uint8');\n\n% Pointer\npointer = (ndx-1)*nbytesPerImage;\noffset = pointer;\noffset(2:end) = offset(2:end)-offset(1:end-1)-nbytesPerImage;\n\n% Read data\n[fid, message] = fopen(filename, 'r');\nif fid == -1\n    error(message);\nend\nfrewind(fid)\nfor i = 1:Nimages\n    fseek(fid, offset(i), 'cof');\n    tmp = fread(fid, nbytesPerImage, 'uint8');\n    img(:,i) = tmp;\nend\nfclose(fid);\n\nimg = reshape(img, [sx sx 3 Nimages]);\n<\/code><\/pre>\n<pre><code>% load in first 10 images from 79,302,017 images\nimg = loadTinyImages([1:10]);\n<\/code><\/pre>\n<p>Needless to say, nothing works in MATLAB: it gives me a path error that I have no idea how to resolve, and it shows no images. I can&#8217;t learn MATLAB right now, so I&#8217;d like to read this huge .bin file with Python. Am I missing something obvious?<\/p>\n<p>Thanks a lot in advance for any help, and sorry about my English.<\/p>\n<\/div>\n<p><!-- SC_ON -->   submitted by   <a href=\"https:\/\/www.reddit.com\/user\/AstroGippi\"> \/u\/AstroGippi <\/a> <br \/> <span><a 
href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1bkko1u\/80_million_tiny_images_dataset_image_decoding\/\">[link]<\/a><\/span>   <span><a href=\"https:\/\/www.reddit.com\/r\/datasets\/comments\/1bkko1u\/80_million_tiny_images_dataset_image_decoding\/\">[comments]<\/a><\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>I can&#8217;t get the dataset to display correctly. I&#8217;ve tried converting the MATLAB script into a&#8230;<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[85],"tags":[],"class_list":["post-26360","post","type-post","status-publish","format-standard","hentry","category-datatards","wpcat-85-id"],"_links":{"self":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/26360","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/comments?post=26360"}],"version-history":[{"count":0,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/posts\/26360\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/media?parent=26360"}],"wp:term"
:[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/categories?post=26360"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.graviton.at\/letterswaplibrary\/wp-json\/wp\/v2\/tags?post=26360"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}