Remember the BBC Sound Archive from a while back?

Here's how to download all of the files:

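GETLIST=`curl -s bbcsfx.acropolis.org.uk/assets | awk -F '"' '{ print $2 }' |
awk -F '.' '{ print $1 }' | grep -v location`
for i in $GETLIST; do
# use $i here, otherwise every pass fetches the same URL; the awk above
# strips the extension, so add it back if the server needs it
wget "bbcsfx.acropolis.org.uk/assets/$i"
done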

aschmitz @aschmitz

@wohali I think if you pass that as a list to wget, you benefit from keepalives and pipelining, which might be handy given that it's a *lot* of small files.


@aschmitz good point, care to suggest a modification? my brain hurts.

@wohali Eh, it's bad and untested but:

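GETLIST=`curl -s bbcsfx.acropolis.org.uk/assets | awk -F '"' '{ print $2 }' |
awk -F '.' '{ print $1 }' | grep -v location`
# -f keeps rm quiet when uris.txt does not exist yet
rm -f uris.txt
for i in $GETLIST; do
# one URL per name; $i has had its extension stripped, add it back if needed
echo "bbcsfx.acropolis.org.uk/assets/$i" >> uris.txt
done
wget -i uris.txt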

In theory you could run that file through parallel or something too, but I wouldn't want to hammer the server too much.
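
For what it's worth, a throttled take on the parallel idea might look something like the sketch below. It has not been run against the real server. It assumes a uris.txt with one URL per line and an xargs that supports -P (GNU and BSD both do). The names fetch_all, JOBS, BATCH, WAIT_SECS and FETCH are made up for illustration and the default values are guesses, not tuned numbers.

```shell
#!/bin/sh
# Sketch only: download the URLs in a list file with a small, fixed number
# of parallel wget processes, pausing between requests to stay polite.
# JOBS, BATCH, WAIT_SECS and FETCH are hypothetical knobs, not from the thread.
JOBS=${JOBS:-2}            # how many wget processes run at once
BATCH=${BATCH:-50}         # how many URLs each wget invocation receives
WAIT_SECS=${WAIT_SECS:-1}  # seconds each wget waits between retrievals
FETCH=${FETCH:-wget}       # set to "echo" for a dry run

fetch_all() {
    # $1 is the URL list, one URL per line.
    # -n gives each wget a batch of URLs so it can reuse its connection;
    # -P caps the number of wget processes running at the same time.
    # -nc (no-clobber) skips files that were already downloaded.
    xargs -n "$BATCH" -P "$JOBS" $FETCH --wait="$WAIT_SECS" -nc < "$1"
}
```

Calling fetch_all uris.txt after this starts the download. Setting FETCH=echo first prints the wget command lines instead of running them, which is a cheap way to check the batching before touching the server. Keeping JOBS low and WAIT_SECS above zero is the whole point: batching keeps the connection reuse, and the wait keeps the request rate down.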

@aschmitz thanks! yeah, you wouldn't want to get auto-blocked, either... :)