The combination of the interesting files one can find on public FTP servers and the technical expertise required to build a decent search engine motivated me to write Findex and, ultimately, this blog post.
In short, I wrote an FTP/HTTP/SMB crawler that can traverse and index a large number of servers. I pointed it at every single IP (port 21) in The Netherlands in order to:
- ensure that my indexing software can handle large amounts of data
- see what I can discover
This article is by no means presented as ‘new’. However, given that I was still able to collect an enormous number of private files, I’d say this deserves some attention.
- Although not illegal, mass port scanning is a great way to get kicked off your ISP. I would not recommend doing that from your home connection.
- Indexing/crawling many FTP servers might also not be to the liking of your ISP.
- Traversing public FTP servers is not illegal, however, it is the reader’s responsibility to obey all applicable local, state and federal laws.
- I merely indexed the servers; I did not download individual files.
- This article is presented as research.
As previously stated, I decided to concentrate on public FTP servers located in my own country, The Netherlands.
I used a list of IP blocks belonging to Dutch internet providers and started my scan. Because Findex can do distributed scans and crawls, it only took half a day.
The scan discovered 257807 FTP servers, of which 7578 required no form of authentication. After filtering out the servers that did not contain any files, 2359 public FTP servers remained. On those I was able to discover 18.088.392 files, a little over 18 million.
I had now indexed every single file stored on a public FTP server located in The Netherlands.
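Whether a server "requires no form of authentication" can be tested by attempting an anonymous login. The following is a minimal sketch using Python's standard `ftplib`, not Findex's actual code. The function name `allows_anonymous_login` and the `ftp_factory` parameter are hypothetical; the latter only exists so the logic can be exercised without touching the network.

```python
import ftplib


def allows_anonymous_login(host, port=21, timeout=5.0, ftp_factory=ftplib.FTP):
    """Return True if the server accepts an anonymous login, False otherwise."""
    ftp = ftp_factory()  # created unconnected; connect() is called explicitly
    try:
        ftp.connect(host, port, timeout=timeout)
        ftp.login()  # no arguments: ftplib logs in as user "anonymous"
        return True
    except ftplib.all_errors:
        # Covers refused connections, timeouts and "530 Login incorrect" replies.
        return False
    finally:
        ftp.close()  # safe to call even if the connection was never opened
```

A call such as `allows_anonymous_login("ftp.example.org")` (a placeholder hostname) would return `True` only when the server lets the anonymous user in.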
- 257.807 FTP servers
- 7578 public FTP servers - 2.9%
- 2359 public FTP servers containing files - 0.9%
- 18.088.392 files
- 438.994 terabyte
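The percentages in this list are relative to the total number of discovered FTP servers. As a quick check, the arithmetic can be reproduced in a few lines of Python; the variable and function names are purely illustrative.

```python
total_servers = 257_807       # FTP servers discovered by the scan
anonymous_servers = 7_578     # servers that accepted an anonymous login
servers_with_files = 2_359    # anonymous servers that actually held files
indexed_files = 18_088_392    # files found on those servers


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


anonymous_share = share(anonymous_servers, total_servers)
with_files_share = share(servers_with_files, total_servers)
files_per_server = indexed_files // servers_with_files

print(anonymous_share, with_files_share, files_per_server)
```

This gives 2.9% and 0.9%, matching the list, and an average of roughly 7,700 files per non-empty public server.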
I forgot to look at write permissions, so unfortunately I do not have these statistics for you.
Domains - top 13
All entries in the following list are Dutch ISPs except for the 4th place which seems to be a hosting company. Servers there probably come with a default public FTP account.
I did a lookup on all the IP addresses for their physical location.
The following table shows the distribution of file categories.
SELECT file_format,sum(file_size),count(*) FROM files WHERE file_isdir != TRUE GROUP BY file_format;
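To illustrate what this query returns, here is a small sketch that runs it against an in-memory SQLite database. The table layout, the sample rows and the category codes (1 and 2) are assumptions made for the example; only the column names used in the query come from the original. Findex's real schema and database engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, minimal version of the `files` table.
conn.execute(
    "CREATE TABLE files (file_name TEXT, file_format INTEGER, "
    "file_size INTEGER, file_isdir BOOLEAN)"
)
rows = [
    ("holiday.jpg", 1, 2_000_000, False),
    ("scan.png", 1, 500_000, False),
    ("notes.txt", 2, 1_000, False),
    ("photos", 0, 0, True),  # a directory, excluded by the WHERE clause
]
conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)

query = (
    "SELECT file_format, sum(file_size), count(*) FROM files "
    "WHERE file_isdir != TRUE GROUP BY file_format"
)
# Map each category code to (total size in bytes, number of files).
per_category = {fmt: (size, n) for fmt, size, n in conn.execute(query)}
print(per_category)
```

Each result row carries a category code, the summed size of its files and the file count, which is all that is needed to build a distribution table.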
Surprisingly, 28% of all the files collected were pictures (5 million!). Mostly private photographs…
File Extensions - top 10
Next up are the discovered file extensions.
SELECT file_ext, count(*) FROM files WHERE file_isdir != TRUE GROUP BY file_ext ORDER BY count(*) DESC LIMIT 10;
SELECT count(*) FROM files WHERE file_isdir != TRUE AND file_format = 1 AND searchable LIKE 'keyword%';
|‘wachtwoord’ and ‘password’||396||‘wachtwoord’ means ‘password’ in Dutch. Text documents came up with lists of passwords|
|passport||192||Images and documents of passports|
|belastingaangifte||517||‘belastingaangifte’ means ‘tax return’ in Dutch. Tax documents came up.|
|‘factuur’ and ‘invoice’||4544||‘factuur’ means ‘invoice’ in Dutch. A lot of invoices came up.|
|creditcard||139||Photos and documents of credit cards|
|gemeente||614||‘gemeente’ means ‘local authority’ in Dutch. Government-related documents came up.|
|wp-config.php||32||Configuration file for Wordpress|
|configuration.php||61||Configuration file for Joomla|
|config.php||428||Configuration files for various other web applications|
|passwd||82||Information file about users on unix systems|
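The counts in this table come from a prefix match: `LIKE 'keyword%'` only matches entries whose `searchable` value starts with the keyword. The sketch below shows that behaviour on an in-memory SQLite database. The helper `count_keyword`, the table layout and the sample rows are assumptions for illustration, not Findex's code.

```python
import sqlite3


def count_keyword(conn, keyword, file_format=1):
    """Count non-directory files in one category whose name starts with keyword."""
    # Note: '%' and '_' inside `keyword` would themselves act as LIKE wildcards.
    pattern = keyword + "%"
    row = conn.execute(
        "SELECT count(*) FROM files WHERE file_isdir != TRUE "
        "AND file_format = ? AND searchable LIKE ?",
        (file_format, pattern),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (searchable TEXT, file_format INTEGER, file_isdir BOOLEAN)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("passwords_2014", 1, False),  # starts with the keyword: counted
        ("password list", 1, False),   # starts with the keyword: counted
        ("my_password", 1, False),     # keyword not at the start: skipped
        ("password", 1, True),         # a directory: skipped
        ("password.jpg", 3, False),    # different category: skipped
    ],
)
matches = count_keyword(conn, "password")
print(matches)
```

Because the pattern is anchored at the start, names that merely contain the keyword are not counted, so the real number of sensitive files is likely higher than the table suggests.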
I viewed a few of the files and they were indeed what the filenames depicted.
Obviously I will not publish any of these documents or pictures. But I will also not notify the affected parties in question for the following 2 reasons:
- Too many hosts
In this particular scenario I would need to alert 2500+ hosts that their FTP servers are not properly secured. For me this is not realistic. It would take too much time and I would rather not police the internet myself. Instead, I hope that this research will spread awareness.
Even in 2015, many public FTP servers on the internet are still hosting sensitive files. I had the ability to download a wide variety of sensitive documents, and other people are almost certainly doing this too.
The software used can be found on my projects page. It is called Findex.