Static Wikipedia

Hey y’all, thanks for your awesome work on this project!

I’ve got the static Spanish version of Wikipedia up and running on a TP-Link router (which can’t run anything non-static), but unfortunately it’s woefully lacking in content. Is there any documentation available for the tools used to make it? I would like to make a version with more content if at all possible.

Hi @nixcamic – it’s actually created by a fully separate nonprofit - http://schools-wikipedia.org/

That website seems different. It doesn’t seem to have the search function that the Rachel module does (although it does have the Title Index pages, which would work as well), and it seems to be English-only. Also, the download link doesn’t seem to work at all.

Can you tell me which module you’re talking about then? I thought for sure that was it. Our static versions in other languages come from Kiwix: http://dev.worldpossible.org/cgi/rachelmods.pl

http://dev.worldpossible.org/cgi/viewmod.pl?module_id=94 is the module in question. It doesn’t seem to be Kiwix, because it’s completely static HTML and JavaScript, and it doesn’t seem to be Wikipedia for Schools, because the layout and article selection are completely different.

Super interesting, I didn’t realize we had this. It looks like someone web-scraped Kiwix’s Spanish Wikipedia. They would probably just have used wget on the command line plus a JavaScript search index. I’m not sure exactly how, since it was volunteer-created. @jfield @Steve

Is there any way to find out which volunteer? Their GitHub account, email address, Twitter, or anything? I’d really like to update it; it works great but has the weirdest selection of content.

Hi - I’m the guilty party :slight_smile: – glad you found it interesting, at least!

As @jeremy guessed, I scraped Kiwix’s Spanish Wikipedia up to a point in the interest of making a fully-static version for USB. Yes, it is lacking, but in the context of a searchable USB version, that was a necessity. I limited it to content that was within two levels of every category page I could find (manually compiled from the live Spanish Wikipedia). I used wget for this against a locally hosted copy of Kiwix’s Wikipedia in Spanish.
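As a rough illustration of the “within two levels of every category page” rule, here is a minimal Python sketch of a depth-limited breadth-first walk over a link graph. It is not the original process (that was wget against a local Kiwix server); the function name `pages_within_depth`, the `links` dictionary, and the sample page names are all hypothetical.

```python
from collections import deque


def pages_within_depth(links, seeds, max_depth=2):
    """Return every page reachable from the seeds in at most max_depth link hops."""
    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        if depth[page] == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in depth:  # first visit is the shortest path in BFS
                depth[target] = depth[page] + 1
                queue.append(target)
    return set(depth)


# Hypothetical toy link graph: page -> pages it links to.
links = {
    "Categoría:Ciencia": ["Física", "Química"],
    "Física": ["Isaac Newton"],
    "Isaac Newton": ["Cálculo"],
    "Química": [],
}

selected = pages_within_depth(links, ["Categoría:Ciencia"])
print(sorted(selected))
```

In this toy graph “Cálculo” is three hops from the category page, so it is left out. With wget itself, the closest equivalent is recursive retrieval with a depth limit (`--recursive --level=2`); the exact options used for this module are not given in the thread.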

The search is custom-built in JavaScript on top of lunr.js – the search is the primary limitation, as it can only reliably search an index of a few MB before JavaScript starts choking on many client systems. It does a title-only search, and even so this content pack is near the upper limit. If you can’t search it you can’t find it, so I don’t know how much point there is in increasing the size.
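To get a feel for how a title-only index grows with the number of articles, here is a minimal Python sketch that builds a plain inverted index over titles and measures its serialized size. It stands in for the idea only and is not lunr.js or the module’s actual search code; `build_title_index`, `search`, and the sample titles are hypothetical.

```python
import json


def build_title_index(titles):
    """Map each lowercase title word to the ids of the titles containing it."""
    index = {}
    for doc_id, title in enumerate(titles):
        for word in set(title.lower().split()):
            index.setdefault(word, []).append(doc_id)
    return index


def search(index, titles, query):
    """Return the titles that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return [titles[i] for i in sorted(hits)]


# Hypothetical sample titles.
titles = ["Historia de México", "Historia de España", "Química orgánica"]
index = build_title_index(titles)

# Size of the index as it would be shipped to the browser as JSON.
index_bytes = len(json.dumps(index, ensure_ascii=False).encode("utf-8"))
results = search(index, titles, "historia méxico")
print(results, index_bytes)
```

Running the same measurement over a full title list gives a quick check of whether a larger article selection would still fit in the “few MB” range mentioned above before any scraping is done.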

The code for all this is not on GitHub, as it involves lots of manual steps and will probably confuse most people. I can put it up if you think you’d want to work with it, but it’s really not intended for public use and I can’t offer much support!

Hey, if you could put it up that would be awesome! Even just as a starting point that I could work out from, it would be great. If you don’t feel it’s ready to go up on GitHub, posting it here or emailing it to me would be cool too.

Thanks!

I think Kiwix may have a smaller Wikipedia version of the 10,000 most read articles or something like that. I am 90% sure there is an English version of that, or that Kiwix would be willing to make it for you @nixcamic

It doesn’t look like Kiwix has a reduced-size Wikipedia in Spanish, unfortunately. They have versions without pics or videos, but that doesn’t address the problem we have, which is too much text content to search.

I should ask specifically: what kinds of content are you searching for and not finding? If I get to making another one of these, I would love to know which topics of interest to add.

That said, here is the collection of scripts I used to put it all together, including a README.txt file that gives an overview. I wish you luck, but I warn you that it is unlikely to work without lots of tweaks and experimentation. Sorry!

Awesome, thanks! I’ve been swamped the last few days, but I’m gonna try to get into it this week!

That link http://dev.worldpossible.org/cgi/rachelmods.pl doesn’t work. I would like to add the static Burmese Wikipedia and Burmese Wiktionary.

Please go here instead: http://oer2go.org/

I thought rachelmods.pl was the Perl script for adding a module. Yes, I’ve already been to http://oer2go.org and found some modules.

I would like to add the Burmese Wikipedia there.

Thanks,