kyleleelarson's comments

My immediate (incorrect) association upon seeing the words "Dark Star" on HN was with the Grateful Dead song, which is about 56 years old. The Live/Dead version was a true epiphany for me, now 18 years ago. After really only liking the late 60s Dead for about a decade, I have apparently aged into liking their later stuff. There is a nice live video of Dark Star from 1974 on YouTube: search "Grateful Dead - Dark Star (Winterland 10/18/74)".


I am curious why the term "fiber optic" seems to have declined in popularity, at least when it comes to big companies' annual reports: see https://searchsecdata.com/search?stockindex=S%26P+500&search...


www.searchsecdata.com, which is like "Google Trends" for the 10-k filings that public companies submit to the SEC. It currently supports full-text search on almost all 10-k filings for current S&P 500 and Russell 2000 companies for the last 20 years.
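To illustrate the "trends" idea, here is a minimal sketch that counts, per year, how many filings mention a term. The function name `filings_mentioning` and the sample data are made up for illustration; this is not how the site itself is implemented, and a real system would presumably use a full-text index rather than a linear scan.

```python
from collections import Counter

def filings_mentioning(filings, term):
    """Count, per year, how many filings mention `term` (case-insensitive).

    `filings` is an iterable of (year, text) pairs. This is a hypothetical
    in-memory stand-in for a real filings database.
    """
    needle = term.lower()
    counts = Counter()
    for year, text in filings:
        if needle in text.lower():
            counts[year] += 1
    # Sort by year so the result reads like a trend line.
    return dict(sorted(counts.items()))

# Toy data, invented for this example only.
sample = [
    (2005, "We expanded our fiber optic network."),
    (2005, "Fiber optic cable sales grew."),
    (2023, "We expanded our broadband network."),
    (2023, "Our fiber optic assets were sold."),
]
trend = filings_mentioning(sample, "fiber optic")
print(trend)
```

Dividing each count by the number of filings in that year would give a share rather than a raw count, which is probably the fairer comparison across years.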


Good stuff! A few suggestions:

- add year range, like 2019-2024

- bold searched word in results; italic is not so visible

- custom table results per page (25, 50, 100)

- export to csv of all results in table

Did you scrape all the text into a database, or do you search through a bunch of text files?


Thanks for the suggestions! Yeah, I scraped the annual reports from the SEC website and built a database of the results. Lots of malformed XML to deal with.
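For anyone hitting the same malformed-markup problem, one possible approach (an assumption on my part, not necessarily what was done here) is to skip strict XML parsing and pull the text out with Python's lenient `html.parser.HTMLParser`, which does not raise on unclosed or mismatched tags. The class name `TextExtractor` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags, even when tags are unbalanced."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text, trimmed of surrounding whitespace.
        stripped = data.strip()
        if stripped:
            self.chunks.append(stripped)

def extract_text(markup):
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered
    return " ".join(parser.chunks)

# Made-up example: <p> and <b> are never closed, and </doc> closes too early.
broken = "<doc><p>Item 1. Business<p>We sell <b>fiber optic cable</doc>"
text = extract_text(broken)
print(text)
```

The tradeoff is that the document structure is thrown away, which is fine for full-text search but not if you need to know which section a passage came from.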


I am working on a somewhat similar project, for searching items 1 and 1a in 10-k annual reports, that I am hoping to release in the near future. I would be interested to hear what lessons you end up learning about scaling up to handle the interest you got from HN.


Definitely limit the more disk-heavy features, or spend more time (and money) on infrastructure. I was running the whole site on a VM with 8 vCPUs and 24 GB of RAM, and it almost immediately crashed due to the high disk reads.

This is likely because the database is huge, and providing that data on demand is very resource intensive, especially when there are forty different people sending many requests a second.

If you don't want to compromise on data though, look into spending a little bit more time/money on infrastructure. I wish I had deployed the project on Kubernetes, instead of what I ended up doing.
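One cheap way to limit disk reads, sketched here as an illustration rather than as what this site did, is to cache repeated queries in memory so that many people searching for the same term only hit the disk once. `run_query_on_disk` is a hypothetical stand-in for the real database call, and the counter exists only to make the effect visible.

```python
from functools import lru_cache

DISK_READS = 0  # counts simulated disk hits, for illustration only

def run_query_on_disk(term: str) -> int:
    """Hypothetical stand-in for an expensive, disk-bound database query."""
    global DISK_READS
    DISK_READS += 1
    return len(term)  # placeholder result

@lru_cache(maxsize=1024)
def cached_search(term: str) -> int:
    # Identical terms are answered from memory after the first call.
    return run_query_on_disk(term)

# Forty requests for the same term, as in a traffic spike.
for _ in range(40):
    cached_search("fiber optic")

print(DISK_READS, cached_search.cache_info())
```

This only helps when queries repeat, and a per-process cache is lost on restart, so it complements better infrastructure rather than replacing it.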

