Financinos - Data Manual


Important: The data in this database comes from the SEC's EDGAR data files. Please visit the SEC's EDGAR website for more information about the source data. The data definitions for the CSV file download are as follows:

  1. record_id - Unique identifier for each record in the database
  2. tag_sec - Retrieved from EDGAR. The SEC's description of the specific line item
  3. label_sec - Retrieved from EDGAR. Similar to 'tag_sec', but this is the original description provided by the company
  4. report - The report where this data can be found (see 'url' below; insert this value into the url where you see [])
  5. line - The line in the report where this data can be found
  6. stmt - Statement type
  7. data_date - Date on which the financial statement was filed
  8. fy - Fiscal year for the financial statement
  9. fq - Fiscal quarter for the financial statement
  10. qtrs - A value of '0' means the record covers one quarter (quarterly data). A value of '4' means it covers all four quarters (annual data)
  11. uom - Unit of measure, e.g. currency, shares, etc.
  12. tag_renamed - A cleaned description of 'tag_sec'
  13. value - The value of the record
  14. line_item - Underscored numbers that were aggregated in the financial statement
  15. url - The location of the financial report on the EDGAR website, e.g.: https://www.sec.gov/Archives/edgar/data/789019/000119312510015598/R[report].htm
  16. ein_id - The unique value that identifies the business entity
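As a rough illustration of how the 'report' and 'url' fields fit together, the sketch below fills the bracketed placeholder in a url with a report number. The function name fill_report_url and the report value 4 are hypothetical, and the sketch assumes the placeholder is the first bracketed span in the url (either [] or [report]); check a few rows of the CSV before relying on it.

```python
import re


def fill_report_url(url_template: str, report) -> str:
    """Replace the first bracketed placeholder in the url with the report value.

    Assumption: the placeholder looks like [] or [report], as in the example url.
    """
    filled, count = re.subn(
        r"\[[^\]]*\]",            # any [...] span, including an empty []
        lambda _match: str(report),
        url_template,
        count=1,
    )
    if count == 0:
        raise ValueError("no [] placeholder found in url")
    return filled


# Example url taken from the 'url' definition; the report number 4 is made up.
example = "https://www.sec.gov/Archives/edgar/data/789019/000119312510015598/R[report].htm"
print(fill_report_url(example, 4))
```

A url without a placeholder raises ValueError rather than being returned unchanged, so malformed rows are noticed early.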

DISCLAIMER: Before using this database, please note the following:

  1. Only limited quality assurance checks have been performed on the data, because only one person maintains it and there are over 30 million records.

  2. Not all data periods may be available.

  3. For the cash flow statements, the quarterly data for Q2 through Q4 is programmatically calculated, as I could not find these values directly in the SEC's EDGAR data files. For example, if Microsoft's Q2 aggregated 'Cash Generated By Financing Activities' value is 100 and its Q3 aggregated value is 150, then the Q3 quarterly value is 150 - 100 = 50. This is because the values in the cash flow statement are aggregated each quarter in the EDGAR data files, meaning that the Q3 figure in EDGAR covers Q1 through Q3.

  4. For the income statements, the quarterly data for Q4 is mostly programmatically calculated, as I often could not find these values directly in the SEC's EDGAR data files. For example, if Microsoft's Q3 aggregated 'Operating Income' value is 100 and its Q4 aggregated value is 150, then the Q4 quarterly value is 150 - 100 = 50. Most of the Q4 income statement values found in the EDGAR data files were aggregated; values that were already non-aggregated were left unchanged. Q2 and Q3 income statement data included both aggregated values and their non-aggregated counterparts. Q1 income statement data was not aggregated.

  5. The only aggregated data shown in this application is FY (fiscal year) end data. The FY end data for the cash flow and income statements are aggregated values from Q1 through Q4. The fiscal year end data for the balance sheet equals Q4 data. This is the correct method to display financial statements.

  6. For the exact data cleaning process, please visit my GitHub, where the code shows how the data was processed.
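The subtraction described in items 3 and 4 can be sketched as follows. This is only an illustration of the idea, not the actual cleaning code (which is on GitHub). The function name decumulate and the Q1 value of 40 are hypothetical; the Q2 and Q3 values of 100 and 150 come from the example in item 3.

```python
def decumulate(year_to_date):
    """Turn aggregated (year-to-date) values into single-quarter values.

    year_to_date is ordered by quarter, e.g. [Q1, Q1..Q2, Q1..Q3, Q1..Q4].
    Each quarterly value is the aggregated value minus the previous one.
    """
    quarterly = []
    previous = 0
    for aggregated in year_to_date:
        quarterly.append(aggregated - previous)
        previous = aggregated
    return quarterly


# Q1 = 40 is a made-up value; 100 and 150 are the Q2 and Q3 aggregated
# values from the cash flow example, so the Q3 quarterly value is 50.
print(decumulate([40, 100, 150]))  # [40, 60, 50]
```

The first quarter passes through unchanged, which matches the note that Q1 data was not aggregated.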