Data Manual
Important things to note: The data used in this database comes from SEC's EDGAR data files. Please visit SEC's EDGAR website for more information about the source data. With that in mind, here are the data definitions for the CSV file download:
DISCLAIMER: Before using this database, please note the following:
Quality assurance checks on the data have been limited, because only one person maintains the data and there are over 30 million records.
Not all data periods may be available.
For the cash flow statement, the quarterly data for Q2 through Q4 is programmatically calculated, as I could not find these values directly in SEC's EDGAR data files. For example, for Microsoft, if its Q2 aggregated 'Cash Generated By Financing Activities' value is 100 and its Q3 aggregated value is 150, then the Q3 quarterly value equals 150 - 100 = 50. This is because the values in the cash flow statement are aggregated each quarter in the EDGAR data files: the Q3 figure in EDGAR covers Q1 through Q3.
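The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from the GitHub repository; the function name `decumulate` and the Q1 figure of 60 are hypothetical, while the Q2 and Q3 figures come from the example above.

```python
def decumulate(cumulative_values):
    """Turn aggregated (Q1-to-date) values into standalone quarterly values.

    cumulative_values is ordered Q1, Q2, Q3, ... and each entry covers
    Q1 through that quarter, as in the EDGAR cash flow data.
    """
    quarterly = []
    previous = 0
    for value in cumulative_values:
        # Each quarter's own value is its aggregate minus the prior aggregate.
        quarterly.append(value - previous)
        previous = value
    return quarterly


# Q1 = 60 is a made-up figure; Q2 = 100 and Q3 = 150 match the example.
aggregated = [60, 100, 150]
print(decumulate(aggregated))
```

The Q1 value passes through unchanged because its aggregate already covers a single quarter.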
For the income statement, the quarterly data for Q4 is mostly programmatically calculated, as I often could not find these values directly in SEC's EDGAR data files. For example, for Microsoft, if its Q3 aggregated 'Operating Income' value is 100 and its Q4 aggregated value is 150, then the Q4 quarterly value equals 150 - 100 = 50. Most of the Q4 income statement values found in SEC's EDGAR data files were aggregated; values that were already non-aggregated were left unchanged. The Q2 and Q3 income statement data included both aggregated values and their non-aggregated counterparts. The Q1 income statement data was not aggregated.
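The Q4 rule for the income statement can be sketched as follows. This is only an illustration under stated assumptions, not the repository's code: the function name `q4_quarterly_value` and its parameters are hypothetical, and a missing non-aggregated value is represented here as `None`.

```python
from typing import Optional


def q4_quarterly_value(
    q4_non_aggregated: Optional[float],
    q4_aggregated: float,
    q3_aggregated: float,
) -> float:
    """Return the standalone Q4 value for one income statement line item."""
    if q4_non_aggregated is not None:
        # A non-aggregated Q4 value was found, so it is left alone.
        return q4_non_aggregated
    # Otherwise derive Q4 from the aggregated Q4 and Q3 values.
    return q4_aggregated - q3_aggregated


# Figures from the 'Operating Income' example above.
print(q4_quarterly_value(None, 150, 100))
```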
The only aggregated data shown in this application is fiscal year (FY) end data. The FY end data for the cash flow and income statements are aggregated values from Q1 through Q4. The FY end data for the balance sheet equals the Q4 data. This is the correct method to display financial statements.
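As a rough sketch of the fiscal year rule, the function below sums the four quarters for the cash flow and income statements and takes the Q4 value for the balance sheet. The function name `fiscal_year_value`, the statement labels, and the sample figures are all hypothetical and are not taken from the repository.

```python
def fiscal_year_value(statement, quarterly_values):
    """Return the FY end value for one line item.

    quarterly_values holds the standalone values for Q1, Q2, Q3, Q4.
    """
    if len(quarterly_values) != 4:
        raise ValueError("expected values for Q1 through Q4")
    if statement == "balance_sheet":
        # Balance sheet items are point-in-time, so FY end equals Q4.
        return quarterly_values[3]
    # Cash flow and income statement items accumulate over the year.
    return sum(quarterly_values)


# Made-up quarterly figures for illustration only.
print(fiscal_year_value("income_statement", [60, 40, 50, 30]))
print(fiscal_year_value("balance_sheet", [500, 520, 510, 540]))
```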
For the exact cleaning process, please visit my GitHub; the code there shows how the data was processed.