IF Archive submissions by year

In case anybody’s curious, here are some charts of submissions per year and total submissions. Blue is number of files, green is total file size (in bytes).

I slurped this data out of the Master-Index.xml file, which is always available at http://ifarchive.org/indexes/Master-Index.xml.

(A handful of files show a timestamp before 1992. These are mostly IFComp entries which were submitted with bad timestamps. E.g. http://ifarchive.org/if-archive/games/competition96/jacl/eldor.ps cannot actually be dated 1979!)
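For anyone who wants to reproduce the per-year tallies, here is a minimal sketch of one way to do it. It is not the script used for the charts. It assumes each file appears in the index as a `file` element with `size` (bytes) and `rawdate` (Unix timestamp) children; those element names are assumptions, so check them against the real Master-Index.xml. The function name `submissions_by_year` and the `earliest_year` cutoff are made up for illustration. Files dated before the cutoff are counted separately rather than charted.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime, timezone


def submissions_by_year(root, earliest_year=1992):
    """Return ({year: [file_count, total_bytes]}, suspect_count).

    Files dated before earliest_year are treated as bad timestamps and
    only counted in suspect_count.
    """
    totals = defaultdict(lambda: [0, 0])
    suspect = 0
    # ASSUMPTION: <file> elements with <size> and <rawdate> children.
    # These names are hypothetical; adjust them to the actual schema.
    for f in root.iter('file'):
        size = f.findtext('size')
        rawdate = f.findtext('rawdate')
        if size is None or rawdate is None:
            continue  # skip entries missing either field
        year = datetime.fromtimestamp(int(rawdate), tz=timezone.utc).year
        if year < earliest_year:
            suspect += 1
            continue
        totals[year][0] += 1          # number of files
        totals[year][1] += int(size)  # total size in bytes
    return dict(totals), suspect
```

To try it on the real index, pass `ET.parse('Master-Index.xml').getroot()` as `root`. Running totals for the second chart are just a cumulative sum over the sorted years.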


Interesting, thanks for the info and the charts!

I was curious what caused the spike in added files in 2015, and it seems that the explanation is that the IF Comp 2015 entry “Growbotics” comprises 971 files (images).

Just for fun, here’s a quick-and-dirty Python script that reads Master-Index.xml and prints the directories sorted by file count in descending order:

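import xml.etree.ElementTree as et

# Load the index (download Master-Index.xml into the working directory first).
tree = et.parse('Master-Index.xml')
root = tree.getroot()

# Map each directory name to the number of files directly inside it.
dirfilecounts = {}

for d in root.findall('directory'):
    dirname = d.find('name').text
    filecount = int(d.find('filecount').text)
    dirfilecounts[dirname] = filecount

# Print (directory, filecount) pairs, largest count first.
print(sorted(dirfilecounts.items(), key=lambda item: item[1], reverse=True))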

The top 10 results are:

  1. ('if-archive/games/competition2015/GROWBOTICS', 971)
  2. ('if-archive/games/zcode', 730)
  3. ('if-archive/solutions', 463)
  4. ('if-archive/games/twine', 460)
  5. ('if-archive/games/competition2014/Milk Party Palace/MilkPalace_Finished_v132_Data', 430)
  6. ('if-archive/games/pc', 419)
  7. ('if-archive/games/appleII/eamon/guild/dsk/dos33', 264)
  8. ('if-archive/games/tads', 253)
  9. ('if-archive/rec.arts.int-fiction', 249)
  10. ('if-archive/games/appleII/eamon/guild/original/dos33', 248)
    tied with
    ('if-archive/games/glulx', 248)

(This does not account for directories whose files are spread across many subdirectories with only a few files in each, but as I said, it’s just for fun and quick-and-dirty.)
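If you do want totals that include subdirectories, here is a rough sketch of one way to roll the counts up. It assumes, as the script above does, that every directory is a top-level `directory` element whose `name` is the full slash-separated path, and that `filecount` covers only the files directly in that directory (that last part is my assumption). The function name `recursive_filecounts` is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def recursive_filecounts(root):
    """Map each directory path to the file count of it plus all subdirectories."""
    totals = Counter()
    for d in root.findall('directory'):
        name = d.findtext('name')
        # ASSUMPTION: filecount counts only files directly in this directory.
        count = int(d.findtext('filecount'))
        # Credit this directory's files to itself and to every ancestor path.
        parts = name.split('/')
        for depth in range(1, len(parts) + 1):
            totals['/'.join(parts[:depth])] += count
    return totals
```

With the real index loaded via `ET.parse('Master-Index.xml').getroot()`, `recursive_filecounts(root).most_common(10)` should give the ten largest subtrees. The top-level `if-archive` entry will naturally come first, since it contains everything.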

If we had archived URA Winner in its original state, it would have more than doubled the number of files overnight. So we didn’t.

(There are scripts that iterate through all the files, generating Master-Index.xml and so on. We’re not going to declare a hard limit, but it’s worth keeping an eye on the file list and avoiding unnecessary bloat.)
