Raspberry Pi offline Wikipedia

Wikipedia is a vast archive of knowledge we tend to forget is there. An encyclopedia written and edited by a community of volunteers, it has a high accuracy rate and information on just about any subject you could want. You can also download an entire archive of it, weighing in at around 90GB at the time of this writing!

I’ve had the idea for a while now of making an offline version to run locally for myself or friends, maybe something to browse during a flight or road trip. Or, as my prepping side says, maybe something easy to access when the power’s out! Enter the Raspberry Pi, a low-cost and low-power computer that can run this with a suite of tools, powered off a battery pack and accessed from a phone, tablet or computer. Well, this is easier than you might think! I’ll be going over the ideas and thought process at a high level, as the project took some time; I can provide more details by reaching out to me if you’d like.

The goals of this project were as follows:

  • Use a Raspberry Pi to run off a battery pack for several hours at minimum
  • Must be 100% self-contained: it should boot, run and provide access without user input
  • Access must be simple; in this case, a self-hosted Wi-Fi network broadcast from the Raspberry Pi
  • Small and easy to travel with. For this reason I went with a Raspberry Pi Zero W, one of the smallest Raspberry Pi boards that exists (about the size of a large flash drive).

Starting with the basics: the Raspberry Pi Zero W. This is a single-board computer a little bigger than a flash drive which can be powered by a small USB battery pack and a micro-USB adapter. I installed a 128GB micro-SD card and flashed an image of Raspberry Pi OS onto it (a Debian-based Linux distribution built for ARM).

The next step was to download a suite of tools called “Kiwix Tools”. This neat set of applications lets you host a downloaded archive of Wikipedia, providing users a simple web interface just like Wikipedia’s. Once done, I could access it over my local network.
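For reference, this boils down to a couple of commands; the package name and ZIM file path below are assumptions for illustration, so adjust for whatever archive you grab from the Kiwix download site:

```shell
# Install the Kiwix tool set (includes kiwix-serve)
sudo apt install kiwix-tools

# Serve a downloaded Wikipedia ZIM archive on port 8080
kiwix-serve --port=8080 /home/pi/zim/wikipedia_en_all_maxi.zim
```

Browsing to port 8080 on the Pi’s address then brings up the Wikipedia interface.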

Next up was using hostapd and dnsmasq to build a Wi-Fi network on the onboard wireless chipset, letting devices connect directly to the Raspberry Pi with DHCP and DNS resolved locally. This allowed me to connect via any device with Wi-Fi; I used a tablet to configure and confirm it.
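As a rough sketch, the two config files end up looking something like this (the SSID, passphrase and address range here are made-up examples, not my actual values):

```ini
# /etc/hostapd/hostapd.conf -- broadcast a standalone Wi-Fi network
interface=wlan0
ssid=OfflineWiki
hw_mode=g
channel=7
wpa=2
wpa_key_mgmt=WPA-PSK
wpa_passphrase=changeme123

# /etc/dnsmasq.conf -- hand out addresses and answer DNS locally
interface=wlan0
dhcp-range=192.168.4.10,192.168.4.100,255.255.255.0,24h
address=/#/192.168.4.1
```

The dnsmasq wildcard resolves every hostname to the Pi itself, so any device that connects lands on the local services.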

For the last portion, I installed a LAMP stack onto the system (Linux + Apache + MySQL + PHP) and a copy of WordPress. I wanted a way to easily write notes into a webpage for anyone to see when accessing this. Information, notes, ideas, etc.; this was the easiest way to write and view them. It’s surprising how well this runs on the little system, too!
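The stack itself is only a handful of packages; something along these lines, with package names from the standard Debian/Ubuntu repos and WordPress fetched from wordpress.org:

```shell
sudo apt install apache2 mysql-server php libapache2-mod-php php-mysql

# Fetch and unpack WordPress into the web root
wget https://wordpress.org/latest.tar.gz
sudo tar -xzf latest.tar.gz -C /var/www/html/
```

After that it’s the usual browser-based WordPress setup.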

In the end, I essentially have a small, ultra-low-power server which gives me access to a vast amount of information on an almost endless supply of topics, along with WordPress for notes and anything further I want to add!

In the future, I may add HTTrack to the system to rip websites for offline viewing as well, giving me even more information offline. This was more of a proof-of-concept system, but I’m quite proud of how it turned out. Anyways, I hope you enjoyed my rambling and thoughts. Cheers!

RG350P Handheld First Impressions

After saving up for this handheld, I pulled the trigger recently and have been pretty impressed overall. The system runs buttery smooth, comes preconfigured with a large number of emulators and ROMs, and generally feels like a solid little handheld from the heyday of Game Boy goodness.

Running at 100 bucks with a micro-SD card, the system isn’t super expensive when compared to my Nintendo DS with a custom flash cart for running ROMs (about 90 bucks total and much more limited in what it can do in comparison). This handheld even plays PSX ROMs without issue and has excellent loading times compared to what I remember from playing as a child.

Things of note:

  • Quick to boot. Plays games very well once you find the right ROMs. Surprised it plays PSX ROMs as well as it does (Vigilante 8, Driver and Crash Team Racing each ran for 30 minutes or more without issue).
  • The system will randomly fail to boot (stuck on a black screen), but hitting the reset button immediately fixes it.
  • Came with a 32GB micro-SD card loaded with ROMs, which makes managing them much easier. Comes with a bunch of emulators (including DOSBox, which is neat), and all are preconfigured and just work, which is excellent.
  • Lots of menu options that make it pretty solid to use. Can also save game states to come back to them if the games don’t support saving (very neat feature).
  • Worth 100 bucks? I’d say so for sure, especially with the loaded SD card.

I’m going to be spending a lot of time tinkering with this. It appears it supports networking (guessing via one of the two USB-C ports) so will need to check that out for easier ROM management off my internal SFTP server. In the meantime, I’ll be reliving my childhood on a 4 inch screen.

unRAID: capacity and ease of use over performance

I’ve been looking over various NAS (network-attached storage) operating systems for some time now. Naturally, there are two big players everyone seems to go to: FreeNAS and unRAID. Both boast a considerable user base, community add-on support and a ton of customization, but with one big difference at a quick glance: FreeNAS, as the name implies, is free, while unRAID is a paid, licensed OS. But a quick glance only shows so much.

After spending several months going back and forth, I decided to do some testing with unRAID. One of the biggest reasons was the mix of extra hard drives I wanted to use in the pool for the software RAID configuration. FreeNAS requires matching disks in pairs, and I have odd sets of drives ranging across 4, 8 and 12TB capacities. I initially did some testing on an old 2U server with six 1TB disks to get used to the GUI, then upgraded one of the disks in the array to a 2TB disk to see the process. Spoiler: stupid easy and straightforward, exactly what I want. It was time to go big on the build.

I purchased a Dell R510XD server for this project: 32GB ECC RAM, twin 6-core Xeons and 12-bay capacity; the perfect number of drive bays and overkill on CPU and RAM for future-proofing. Unfortunately, this was the beginning of a bit of a tough learning process…

Being new to software RAID, I forgot to take the hardware RAID card into account. The onboard H700 card does not support JBOD (Just a Bunch Of Disks) mode, which lets the operating system see each individual disk and build the software RAID from them. I had to bite the bullet and order another RAID card and cables that would support the proper config. 50 bucks later, I was in business.

The initial configuration was this: two 12TB disks for parity, plus four 8TB and five 4TB disks for the storage pool. With dual parity disks, up to two disks can fail without data loss. The initial RAID parity burn-in took about 30 hours, which isn’t bad overall. Unfortunately, I soon found the write speeds with the software RAID to be less than stellar, something unRAID is known for. I took the next step of adding a 1TB SSD as a cache disk to mitigate this and can now sustain gigabit throughput on uploads without issue.

Onto the software side of things, I’ve added a few of the usual plugins (Community Applications, Calibre, Plex and others). The installs take all of 30 seconds and typically run within a dedicated Docker instance, something I’d never tinkered with prior but am quickly falling in love with for its simplicity and ease of maintenance. The software RAID seems robust, the GUI is sleek and modern, and everything is snappy and well laid out. I went through and upgraded capacity, replacing one 4TB disk with an 8TB (about 20 hours to burn in), and this again was quick and painless.

One quick thing of note: one of the biggest differences between unRAID and FreeNAS, besides the disk loadouts, is performance. FreeNAS boasts considerably higher read/write speeds due to the way its parity works. The other is that changing a FreeNAS array (modifying, adding or removing disks) takes considerably more work and effort, including CLI management of disks. As someone who’s broken a number of *NIX systems on the CLI, this was a bit of a deal breaker for me. Another difference stemming from the disk management: you can add just ONE more disk at a time to unRAID, whereas FreeNAS requires matching pairs to work.

All in all, I’m shocked at how well this project has come together. With the current config, I’m at 56TB raw and 51.8TB usable capacity. The system is used both as a file dump for all my stuff and as a redundant backup target for several other systems due to its capacity. I would definitely recommend trying the software out for free to see how you like it and whether it’s for you or your business.

Quick take: slower than FreeNAS, more flexible capacity; make sure you have a RAID card that supports JBOD, or direct SATA pass-through.

Automated Youtube Downloads Into Plex (Windows)

Welcome to another Overly Complicated Project! This time, it started with some advice from our friends at r/DataHoarder and a fun tool called “youtube-dl”. This has taken a bit of tinkering and some custom code, but I now have an all-in-one solution that downloads YouTube videos from a playlist/channel, tracks progress to save bandwidth on future downloads, and stores them in a Plex library for local viewing. Let’s begin.

I first started with a VM running Windows Server 2019. Following the steps below, you can install WSL (Windows Subsystem for Linux) and have a full Ubuntu/Linux shell to use. I chose Ubuntu 16.04 LTS, as it’s my favorite version of the server OS.

https://docs.microsoft.com/en-us/windows/wsl/install-on-server

After installing WSL, I ran the below in the BASH/Ubuntu shell to install Pip, ffmpeg (used for video conversion by youtube-dl) and youtube-dl:

sudo apt update
sudo apt install python-pip ffmpeg
sudo pip install youtube-dl

This will install all the necessary packages and get youtube-dl into a running state. For my server, I have a 200GB boot disk and a 10TB secondary disk, so, opening the bash shell and changing to that disk, I made a folder called “youtube” to store all my videos. As a test, you can run youtube-dl against a video of your choosing to confirm everything works. This is the basic command I use for everything, which sets the filename, retries, progress file, etc.:

youtube-dl -o '%(playlist)s/%(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 2 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --write-info-json --embed-subs --all-subs YOUTUBE_URL_HERE

To break this down:

  • -o: output template of Playlist (or channel)/Title.Extension
  • --format: best video and audio for the requested video
  • --continue: resume a partially completed download if interrupted
  • --sleep-interval: sleep for 2 seconds between downloads
  • --verbose: verbose output
  • --download-archive: track downloaded videos in the PROGRESS.txt file to save time and bandwidth
  • --ignore-errors: ignore errors and keep processing
  • --retries: retry up to 10 times on a 404 or similar error
  • --write-info-json: write a JSON file with information about the video
  • --embed-subs/--all-subs: use ffmpeg to embed all subtitles into the video

The above is what I use for everything except 4K channels, which take too much space to grab in bulk. Using this, I tossed several of those commands into a .sh file like this:

youtube-dl -o '%(playlist)s/%(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 2 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --write-info-json --embed-subs --all-subs YOUTUBE_URL_HERE
youtube-dl -o '%(playlist)s/%(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 2 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --write-info-json --embed-subs --all-subs YOUTUBE_URL_HERE
youtube-dl -o '%(playlist)s/%(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 2 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --write-info-json --embed-subs --all-subs YOUTUBE_URL_HERE
youtube-dl -o '%(playlist)s/%(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 2 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --write-info-json --embed-subs --all-subs YOUTUBE_URL_HERE
youtube-dl -o '%(playlist)s/%(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 2 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --write-info-json --embed-subs --all-subs YOUTUBE_URL_HERE

Then simply run it in your BASH shell: sh YOUR_SCRIPT_NAME.sh

Cool, right? Now you have a script with all your commands to download your favorite channels or unlisted/public playlists. One of the cool things with the BASH/Windows integration is that you can also make a Windows BATCH (.bat) file to launch it. Make a .bat file called whatever you want (runme.bat is my favorite) and have it call the script you built, first changing to the directory of YOUR script so it runs properly:

@ECHO off
echo Launching youtube download helper script in BASH...
timeout /t 5
bash -c "cd /mnt/e/youtube/;sh YOUR_SCRIPT_NAME.sh"
echo Completed!
timeout /t 60

Neat, now you can single-click the .bat file to launch your downloads! BUT there’s something else you can do now: Windows Task Scheduler. Go into “Task Scheduler” in Windows and create a simple task. Set it to whatever time you want and have it run daily/weekly/however you’d like.
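If you’d rather script the task itself, the same thing can be created from an elevated command prompt with schtasks (the task name, path and start time below are placeholders, not my actual setup):

```bat
schtasks /Create /TN "YoutubeDownloads" /TR "C:\scripts\runme.bat" /SC DAILY /ST 03:00
```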

Have this simply run your BATCH (.bat) file, and it will now fire off automatically as scheduled. That completes the automated downloads portion. I took this a step further, however: a lot of my favorite music and music videos get taken down frequently, and I wanted a nice, simple way to search and watch/listen to them. Enter Plex.

Grab a copy of Plex Media Server and install it on your Windows system. Having some horsepower here is definitely recommended: a minimum quad core and 6GB+ of RAM (I’m running 8 cores and 12GB due to the extra processing of 40K+ videos).

Under Plex, configure a library of “Other Videos” and point it at the top directory your YouTube videos download into. It will then scan and add any videos it finds by name to make future searching easier. I also went into my Plex server options and configured it to check the library for changes every 12 hours or so to catch anything downloaded overnight. I wake up in the morning and my newly downloaded videos are processed and ready for viewing!

I hope you find this interesting and informative. This has been a long project and has gotten very involved, as I built a Perl-based wrapper to automate more of it. I encourage you to tinker and make this more effective and easier for your specific situation. Maybe a wrapper script to pull URLs from a file? Good luck and happy tinkering!
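That wrapper idea can be sketched in a few lines of shell; the function name, list file and the YTDL override are my own inventions for illustration, not part of the setup above:

```shell
# Read one URL per line from a file and hand each to youtube-dl.
# Set YTDL=echo to dry-run the loop without downloading anything.
download_list() {
    list="$1"
    ytdl="${YTDL:-youtube-dl}"
    while IFS= read -r url; do
        [ -n "$url" ] || continue                  # skip blank lines
        "$ytdl" -o '%(playlist)s/%(title)s.%(ext)s' \
            --format 'bestvideo+bestaudio/best' --continue \
            --download-archive PROGRESS.txt --ignore-errors "$url"
    done < "$list"
}
```

Usage: `download_list channels.txt`, run from the directory holding your PROGRESS.txt archive file.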

WordPress permalinks issues

After running into this issue once more with a fresh WordPress install, I’ve found the only way to get custom permalinks working is to set them up like this:

Go under Settings > Permalinks

Click on “Custom Structure” and insert like this:

/index.php/%year%/%monthnum%/%day%/%postname%/

After hours of Google searching, with people suggesting anything from disabling plugins (there were none) to reverting settings (it was a NEW site with no posts prior to the changes), this is the ONLY way I’ve gotten custom WordPress permalinks to work the way I wanted. Keeping /index.php/ in the structure sidesteps the need for working URL rewriting (e.g. Apache’s mod_rewrite), which is usually the underlying culprit. Hopefully this saves others time and frustration.

Archiving youtube and website data

YouTube has become a bit of a dilemma for many people like myself who enjoy music and the video edits set to it; we love supporting the artists we enjoy along with the video editors. But with companies locking down on content, these videos and channels go offline suddenly and often without warning, so I’ve taken to downloading backups as often as possible. With a little help from r/DataHoarder, I now have a great setup that does this with minimal user intervention.

The fine folks over at r/DataHoarder swear by a tool called “youtube-dl”. For an example install on Ubuntu under WSL in Windows:

 sudo apt install python-pip ffmpeg
 sudo pip install youtube-dl

Then it’s just a matter of providing content to download:

 youtube-dl -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' --format bestvideo+bestaudio/best --continue --sleep-interval 10 --verbose --download-archive PROGRESS.txt --ignore-errors --retries 10 --add-metadata --write-info-json --embed-subs --all-subs https://www.youtube.com/channel/UCuCkxoKLYO_EQ2GeFtbM_bw

This will output everything from the channel into its own directory (in this case “Uploads from Half as Interesting”), sleep 10 seconds between downloads, store info/subs, and record progress to prevent excessive traffic from re-downloading videos. This now runs on a dedicated system, called from Windows Task Scheduler once a week. The bonus is that for the several playlists I download, I simply tag a video into whichever playlist I choose and it’s downloaded automatically in the background for future perusal.

Now, what about backing up an entire website/directory/open directory? Well, there’s a handy tool for that too: wget

Over at r/opendirectories (I love Reddit), the lads and lasses have found some great data/images/videos/music/etc., and it’s always a rush to get it downloaded before it’s gone. In some cases it’s old software and disc images; other times it’s old music from another country, which is interesting to myself and others. In this case, again using the Windows Subsystem for Linux (WSL), you could do something like this:

/usr/bin/wget -r -c -nH  --no-parent --reject="index.html*" "http://s2.tinydl.info/Series/6690c28d3495ba77243c42ff5adb964c/"

In this case, I’m skipping the index files (not needed), “-c” continues where a previous run left off, and “-r” recursively downloads everything under that directory (“--no-parent” keeps it from climbing above it). This is handy for cloning a site or backing up a large number of items at once. It can run for days and can choke on large files (I’ve only seen issues with files over 70GB; your mileage may vary), but it has worked well so far. I now have a bunch of music from Latin America in a folder for some reason.

What are your thoughts? Do you see a lot of videos missing or being copyrighted? Do you have a better way of doing this? Let me know!

My choices for browser addons

A web browser is something everyone uses but no one really thinks about. Sure, some people prefer Chrome or Firefox (myself being in the latter camp), and some even stick with the MS choices of Edge or IE. But what a lot of people don’t know is that there is a myriad of add-ons, themes and plugins that make them so much more than just a browser. Some of these add-ons also provide extra layers of security, and that’s where today’s discussion lies: the add-ons I run for security and privacy, and what they do.

To start: most of the add-ons I’ve looked into work on both Chrome AND Firefox. I recently switched back to Firefox after about a decade of Chrome; between the slowness, the RAM consumption and the number of privacy issues, I’m glad I made the choice.

The first add-on I’ll always run on ANY web browser is Adblock Plus. This should be a staple for anyone using the web and provides the first layer of security and privacy. In my experience it knocks out about 95% of ads and will greatly improve your browsing experience. It also has a bonus perk: a lot of malware seen in the wild comes from bad or insecure ads on websites, so this is a nice line of defense against them.

The second one I run is Privacy Badger. This prevents websites from tracking you from site to site and stops the likes of Facebook from monitoring your browsing after you leave a page and continue on. The biggest surprise with this add-on was the sheer number of trackers on news sites (localsyr.com is a great example). It’s a pleasant feeling watching the numbers pop up and seeing how many trackers are stopped.

Next up is uBlock Origin, which serves as a duplicate ad blocker and second line of defense against ads. Per their website: “An efficient blocker: easy on memory and CPU footprint, and yet can load and enforce thousands more filters than other popular blockers out there.” I run it as a redundant blocker, and it does seem to catch some items that Adblock Plus misses.

Last, but not least: LastPass. I used a Google Sync account for years to sync all my saved passwords and forms. In hindsight, that’s incredibly insecure and a terrible idea; stealing someone’s unencrypted computer would give extremely simple access to all of that information, including passwords, logins and site history. With LastPass, it’s all stored encrypted, which keeps someone with local access out. I’m still testing it, but so far it’s been a bumpy yet improving ride, and I look forward to continuing with it.

Bonus add-on: Disable HTML5 Autoplay. Almost ALL news sites (looking at you again, localsyr…) autoplay videos and scare the crap out of anyone not ready for a loud blast of audio. This is a huge pet peeve of mine, and this add-on has been an absolute godsend for browsing.

What addons do you run? Have you tried these before? Thanks for reading and cheers!