Hacker News | firedup's comments

I’m a quant who taught ML in finance at NYU for 4 years, and have spent ~6 months building an AI workflow that produces long-form equity primer reports. Not selling anything; I’m looking for blunt feedback from analysts who do equity research for a living.

What it does (brief)

    Pulls public sources (filings, transcripts, ownership/insider, patent/R&D mentions, sentiment).

    Outputs a structured primer (business model, KPIs, financials/ratios, risks, comp set, pipeline/innovation, catalysts).

    All points are citation-backed to public links; no paywalled sell-side.

Where I need your critique

    Biggest gaps vs. credible sell-side/independent work?

    Sections you’d cut/condense for real-world usability?

    Where would you not trust automation without a human pass?

    Preferred output format: single PDF/HTML vs modular notes?

Sample (mods: if links aren’t allowed, I’ll remove): RIG (Transocean) primer (HTML): https://storage.googleapis.com/derek-snow-at-outlook-co-nz-p...

Happy to answer technical/process questions in the thread. No DMs, no waitlists, just trying to make this genuinely useful for practitioners.


News Sentiment: Ticker-matched and theme-matched news sentiment datasets.

Price Breakout: Daily predictions for price breakouts of U.S. equities.

Insider Flow Prediction: Features insider trading metrics for machine learning models.

Institutional Trading: Insights into institutional investments and strategies.

Lobbying Data: Ticker-matched corporate lobbying data.

Short Selling: Short-selling datasets for risk analysis.

Wikipedia Views: Daily views and trends of large firms on Wikipedia.

Pharma Clinical Trials: Clinical trial data with success predictions.

Factor Signals: Traditional and alternative financial factors for modeling.

Financial Ratios: 80+ ratios from financial statements and market data.

Government Contracts: Data on contracts awarded to publicly traded companies.

Corporate Risks: Bankruptcy predictions for U.S. publicly traded stocks.

Global Risks: Daily updates on global risk perceptions.

CFPB Complaints: Consumer financial complaints data linked to tickers.

Risk Indicators: Corporate risk scores derived from events.

Traffic Agencies: Government website traffic data.

Earnings Surprise: Earnings announcements and estimates leading up to announcements.

Bankruptcy: Predictions for Chapter 7 and Chapter 11 bankruptcies in U.S. stocks.


How do you source them, Google?


Google is the main source. There are also some GitHub repositories and Reddit links.


That is awesome. I have a proposal for you. I currently run a niche Notion website, ml-quant.com, and it could be much bigger and better than it is: the quant finance community is humongous and there is no central source for it. I have around 200 people on the website per day, running all the important scrapers. Again, this could be much bigger! I have a community both on a newsletter and on LinkedIn. Let me know if this is something you would like to help me build; I would like to create a membership tier in 3-4 months. I am happy to go 50/50. You can find my contact details on GitHub (https://github.com/firmai).


ah, interesting - similar approach it seems...


Interesting. Unrelated to the post, but related to your intro: what are the best open source datasets for maritime data?


The framework includes transformations from tensors, matrices, and vectors. It includes a range of encodings and decompositions such as Gramian Angular Encoding, Recurrence Plot, Markov Transition Fields, Matrix Product State, CANDECOMP, and Tucker Decompositions.

GitHub: https://github.com/firmai/datagene
Colab: https://colab.research.google.com/drive/1QSDTKvNiwc1IRCX_VYr...
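
For anyone unfamiliar with the encodings listed above, here is a minimal sketch of one of them, a Gramian Angular (Summation) Field, which turns a 1-D series into a 2-D matrix. This is plain NumPy written for illustration, not the datagene API; the function name `gramian_angular_summation_field` is hypothetical, and the min-max rescaling to [-1, 1] is an assumption about preprocessing.

```python
import numpy as np


def gramian_angular_summation_field(series):
    """Encode a 1-D series as a Gramian Angular Summation Field (sketch)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("series must not be constant")

    # Assumed preprocessing: min-max rescale to [-1, 1] so arccos is defined.
    scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)  # guard against rounding drift

    # Map each value to an angle, then take cos(phi_i + phi_j) pairwise.
    phi = np.arccos(scaled)
    return np.cos(phi[:, None] + phi[None, :])


gasf = gramian_angular_summation_field([1.0, 2.0, 3.0, 4.0])
print(gasf.shape)
```

The result is a symmetric n x n matrix, so a series of length n can be fed to image-style models or compared matrix-to-matrix.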


It seems like you can subscribe here: https://mailchi.mp/380cc3ca0a61/firmai



Is Google Colab the future of reproducible research?


How does this compare to Kedro?

