This repository contains the SQLite schema and scripts for handling financial data. It takes a generic approach to assets, transactions, balances and related entities, which provides heavy automation capabilities, extensibility and deep analysis opportunities.
* **Schema**. Spreadsheets are terrible. They're made for structured data, although it is nearly impossible to structure data in them. A spreadsheet is literally a set of cells with their own rules: it has no decent data validation mechanisms and no entity relationships, and the schema is stored entirely in the user's mind.
* **Interface.** You can easily render graphs on HTML pages. "DB Browser for SQLite" can be used to edit data. SQL is very convenient for analytics, bulk updates and scripting such as automatic data import. Queries & DDL have great readability. And you may find other ways to interact with the database.
* **Performance, limitations, control.** This project can handle millions of rows, while a single Excel worksheet is limited to 1,048,576 rows. Indexes are a basic thing for a database. The user has wider control over query execution.
* **Portability.** SQLite runs everywhere. It's fast, lightweight, free, secure and does not need an internet connection.
### What to start with?
`fin_transactions` and `balances` are core tables that need regular updates. They will lead you to other important things. For convenience, see the views.
Below you can find the short summary with usage notes for each table. Not all details are described. For the complete structure please see [DDL](./schema.sql).
Generally, it is desirable to know the exchange rate of an asset right before a transaction or balance snapshot, although one may also use deep historical data for analytical purposes.
A **transaction category** describes the logical sense of the transaction.
Transaction categories must form a hierarchy with only one root. If a certain flag (starting with `is_`) is set to true on a category, it must be set to true on all of its child categories. The integrity is ensured via triggers that throw exceptions.
is_passive (boolean as integer not null) - see below
is_initial_import (boolean as integer not null) - whether the transaction is an upload of the existing assets for accounting
parent_id (fk fin_transaction_categories) - reference to a category that is a superset of the current one
min_view_depth (integer not null) - in the flattened representation, category shall not appear on a level lower than N, N>=0. Parent category takes multiple levels instead
> For example, paying tax for the property a person lives in is not a passive loss, although paying it for a property that they rent out is a passive loss.
Any transaction in a category with `is_passive` set must link to a `reason_*` asset, because the conditions for setting `reason_*` on a transaction are a subset of the conditions for marking a category `is_passive`: whenever the latter hold, so do the former. The reverse is not true: a transaction with `reason_*` may still be classified as non-passive if it implied some non-monetary gains or losses. This distinction can be useful when evaluating the overall impact of the asset.
> Examples of transaction categories: expense, salary transfer, rent payment, self transfer or exchange, dividends payout.
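The flag-inheritance rule for categories could be enforced with a trigger along the following lines. This is only a minimal sketch: the `categories` table, its columns and the trigger name are simplified, hypothetical stand-ins rather than the actual DDL, and only inserts are guarded (a complete solution would presumably guard updates too).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified table: only the columns needed for the sketch.
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    is_passive INTEGER NOT NULL DEFAULT 0,
    parent_id INTEGER REFERENCES categories(id)
);

-- Reject a new child that does not carry a flag set on its parent.
CREATE TRIGGER categories_flag_insert
BEFORE INSERT ON categories
WHEN NEW.is_passive = 0 AND EXISTS (
    SELECT 1 FROM categories p
    WHERE p.id = NEW.parent_id AND p.is_passive = 1
)
BEGIN
    SELECT RAISE(ABORT, 'is_passive must be inherited from the parent category');
END;
""")

conn.execute("INSERT INTO categories VALUES (1, 'root', 0, NULL)")
conn.execute("INSERT INTO categories VALUES (2, 'passive income', 1, 1)")
conn.execute("INSERT INTO categories VALUES (3, 'dividends', 1, 2)")


def try_insert(row):
    """Return 'ok' or the error message raised by the trigger."""
    try:
        conn.execute("INSERT INTO categories VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.DatabaseError as exc:
        return str(exc)


# The parent (id 2) is passive, so a non-passive child is rejected.
rejected = try_insert((4, "rent", 0, 2))
row_count = conn.execute("SELECT COUNT(*) FROM categories").fetchone()[0]
```

Because every child is checked against its direct parent, the flag propagates down the whole hierarchy as long as rows are only added through guarded statements.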
### balances (table)
Stores historical balances.
A balance is a verified snapshot of the amount of some asset held in an asset storage at a given time. Therefore, a balance entry may appear at any time without prior transaction activity, and a transaction does not create an obligation to update the balance right after it happens. A balance is consistent with transactions as long as the transaction delta before the balance `datetime` equals the balance value: `[balance] = [sum amount of txs where tx datetime < balance datetime]`. The amount can be negative.
One may argue that storing balances separately is a bad practice because it causes denormalization and data inconsistency. However, it is a deliberate choice to store conflicting data, as such data is loaded from external sources. The purpose of this project is to offer a viewpoint on multiple versions of data in order to resolve these conflicts.
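The consistency rule can be checked with a query like the one below. It is a sketch under simplifying assumptions: the `txs` and `balances` tables here are hypothetical and cover a single asset in a single storage, whereas the real schema has more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified tables: one asset in one storage.
CREATE TABLE txs (datetime TEXT NOT NULL, amount REAL NOT NULL);
CREATE TABLE balances (datetime TEXT NOT NULL, amount REAL NOT NULL);

INSERT INTO txs VALUES
    ('2024-01-01 10:00:00', 100),
    ('2024-01-05 12:00:00', -30),
    ('2024-02-01 09:00:00', 50);

INSERT INTO balances VALUES
    ('2024-01-10 00:00:00', 70),   -- 100 - 30: consistent
    ('2024-02-10 00:00:00', 125);  -- transactions sum to 120: mismatch
""")

# For each balance, sum all transactions strictly before it and compare.
mismatches = conn.execute("""
SELECT datetime, balance, tx_sum
FROM (
    SELECT b.datetime AS datetime,
           b.amount AS balance,
           COALESCE((SELECT SUM(t.amount) FROM txs t
                     WHERE t.datetime < b.datetime), 0) AS tx_sum
    FROM balances b
)
WHERE ABS(balance - tx_sum) > 1e-9
ORDER BY datetime
""").fetchall()
```

Each returned row is a balance snapshot that the recorded transactions do not explain, so either a transaction is missing or the balance needs adjusting.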
A **balance goal** is a plan to have a certain amount of financial assets on a balance for the specific purpose in the future, or keep it there constantly.
It is possible to have multiple goals per asset & storage. To complete all of them, the balance must be equal to or greater than the sum of the individual goals. Their progress is counted one by one according to the `priority`.
deadline (datetime as text) - shows last desired date of completion
result_transaction_id (fk fin_transactions) - if saving resulted in a transaction, that transaction can be linked here. Such a goal will be considered complete
start_datetime (datetime as text not null)
end_datetime (datetime as text) - the goal is relevant up to that moment; always relevant if set to null
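Counting goal progress one by one according to `priority` could look roughly like this. The `goals` table is a hypothetical simplification for one asset and storage, and the sketch assumes that a lower `priority` number is served first, which the description above does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified table: goals for a single asset and storage.
CREATE TABLE goals (
    name TEXT NOT NULL,
    amount REAL NOT NULL,
    priority INTEGER NOT NULL
);
INSERT INTO goals VALUES
    ('emergency fund', 1000, 1),
    ('vacation', 500, 2),
    ('laptop', 800, 3);
""")


def goal_progress(conn, balance):
    """Fill goals in priority order (assumption: lower number comes first)."""
    remaining = balance
    progress = []
    query = "SELECT name, amount FROM goals ORDER BY priority"
    for name, amount in conn.execute(query):
        # A goal receives whatever is left, capped at its own amount.
        filled = max(0.0, min(remaining, amount))
        progress.append((name, filled, filled >= amount))
        remaining -= filled
    return progress


progress = goal_progress(conn, 1200)
```

With a balance of 1200, the first goal is fully covered, the second is partially covered and the third has received nothing yet.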
This table sets a goal for the financial asset distribution.
Each asset & storage (`fin_assets_storages`) from your portfolio can reference a specific allocation group. Calculation should be performed based on `fin_asset_rates`.
`target_share` may be any valid number as it shows proportion. Equal target shares indicate that values of the underlying assets should be the same. For a negative target share, there may exist a positive one with the same value that would compensate corresponding debt. Normalization should happen at a later stage.
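One possible way to normalize target shares at that later stage is to divide each by the sum of absolute values, mirroring how `current_share` is calculated. This is an assumption made for illustration, as is the simplified `allocation_groups` table below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified table of allocation groups.
CREATE TABLE allocation_groups (
    name TEXT NOT NULL,
    target_share REAL NOT NULL
);
INSERT INTO allocation_groups VALUES
    ('stocks', 4),
    ('bonds', 2),
    ('cash', 1),
    ('loan', -1);  -- negative share: debt compensated by 'cash'
""")

# Assumed normalization: share / sum(abs(share)), keeping the sign.
normalized = dict(conn.execute("""
SELECT name,
       target_share / (SELECT SUM(ABS(target_share)) FROM allocation_groups)
FROM allocation_groups
"""))
```

Only the proportions matter here: multiplying every `target_share` by the same positive factor yields the same normalized result.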
Represents real-world assets, purchases and other non-fungible (non-interchangeable) things. The intended use case is to track large and important assets, especially ones that generate passive gains and losses.
Provides double-entry bookkeeping for the operations where both sides are tracked. Swap is an internal transfer of value that may happen between same or different assets, possibly of different nature. Swap changes the value allocation between financial accounts or physical items.
If you edit `balance`, a supplied one will be saved with a date of the current day. Upon the insertion of a new row, a record in `fin_assets_storages` is created if needed and the balance is upserted.
For inserts, there is also one special column, `adjust_balance`: it lets you auto-update `balances` with the amount of the current transaction. In that case, a balance entry with a datetime one second after the transaction is created or updated. This works only with the current datetime. **Please be cautious: while this option is convenient, used wrongly it may mess up your balance. Verify balances.**
category (text lkp fin_transaction_categories not null)
reason_phys_asset (text lkp phys_assets via phys_asset_ownerships at datetime)
value tuples:
1.
asset_type (text lkp fin_asset_types.name not null)
asset_code (text lkp fin_assets.code not null)
storage (text lkp fin_asset_storages.name)
2.
reason_asset_type (text lkp fin_asset_types.name not null)
reason_asset_code (text lkp fin_assets.code not null)
reason_storage (text lkp fin_asset_storages.name)
info:
asset_name
special:
adjust_balance (boolean as integer, insert only) - pseudo-column, if set to true current operation will be auto-reflected in balances. Works only if transaction has datetime of the current moment
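A pseudo-column such as `adjust_balance` can be implemented with an `INSTEAD OF INSERT` trigger on a view. The sketch below is not the project's actual trigger: the tables, view and trigger are hypothetical and reduced to one asset in one storage, the new balance is assumed to be the latest earlier balance plus the transaction amount, and the "current datetime only" restriction is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified tables: one asset in one storage.
CREATE TABLE txs (
    id INTEGER PRIMARY KEY,
    datetime TEXT NOT NULL,
    amount REAL NOT NULL
);
CREATE TABLE balances (datetime TEXT PRIMARY KEY, amount REAL NOT NULL);

-- The pseudo-column is always NULL on read and only matters on insert.
CREATE VIEW txs_view AS
SELECT id, datetime, amount, NULL AS adjust_balance FROM txs;

CREATE TRIGGER txs_view_insert INSTEAD OF INSERT ON txs_view
BEGIN
    INSERT INTO txs (datetime, amount) VALUES (NEW.datetime, NEW.amount);

    -- Assumed rule: latest earlier balance + amount, one second later.
    INSERT OR REPLACE INTO balances (datetime, amount)
    SELECT datetime(NEW.datetime, '+1 second'),
           COALESCE((SELECT b.amount FROM balances b
                     WHERE b.datetime <= NEW.datetime
                     ORDER BY b.datetime DESC LIMIT 1), 0) + NEW.amount
    WHERE NEW.adjust_balance = 1;
END;

INSERT INTO balances VALUES ('2024-03-01 00:00:00', 100);

-- With adjust_balance: a balance row is written one second later.
INSERT INTO txs_view (datetime, amount, adjust_balance)
VALUES ('2024-03-02 08:30:00', -40, 1);

-- Without adjust_balance: only the transaction is stored.
INSERT INTO txs_view (datetime, amount) VALUES ('2024-03-03 08:30:00', 5);
""")

balance_rows = conn.execute(
    "SELECT datetime, amount FROM balances ORDER BY datetime"
).fetchall()
tx_count = conn.execute("SELECT COUNT(*) FROM txs").fetchone()[0]
```

The second insert leaves `balances` untouched, which is exactly the situation where the stored balance and the transaction history can drift apart and should be verified.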
This view contains a row only if there is a mismatch between transaction delta and balance delta during the last 2 years. It is advised to keep this view empty via adding missing transactions or adjusting balances.
Goals that resulted in financial transactions are hidden. A goal is considered accomplished if there are enough corresponding assets on the balance. This state is reversible and thus needs your attention. Accomplished goals are listed at the bottom of the view.
A goal is considered irrelevant, and thus not shown, if it either resulted in a transaction or the current moment is outside of the goal's datetime range.
```
current_share - calculated as balance / sum(abs(balance))
target_share
```
The sum of the absolute values of the `current_share` percentages is always 100. However, a negative balance leads to a negative share. Thus the signed sum may vary from `-100` to `100`, where `100` means that all accounted balances are positive, `-100` means that they are all negative, and `0` means that the sum of negative balances equals the sum of positive balances multiplied by `-1`.
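A small worked example of the share calculation, using a hypothetical `holdings` table whose balances are assumed to be already expressed in one common currency:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: balances already converted to a common currency.
CREATE TABLE holdings (name TEXT NOT NULL, balance REAL NOT NULL);
INSERT INTO holdings VALUES
    ('savings', 600),
    ('brokerage', 200),
    ('credit card', -200);
""")

# current_share as a percentage: balance / sum(abs(balance)) * 100
shares = dict(conn.execute("""
SELECT name,
       100.0 * balance / (SELECT SUM(ABS(balance)) FROM holdings)
FROM holdings
"""))

unsigned_total = sum(abs(v) for v in shares.values())
signed_total = sum(shares.values())
```

Here the shares are 60, 20 and -20: the unsigned total is 100, while the signed total is 60 because one balance is negative.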
* Editable views have a `pseudo_id` column with unique non-null values so that client software can identify which row is being edited.
### Performance
* Expect gigabytes of data; use tricks such as forcing materialization.
* Do not use a view as a part of another view.
* The primary query is `SELECT`, so it makes sense to create many indexes.
* Always index foreign keys, remove indexes that are part of other covering indexes.
* Rely on internal indexes produced by primary keys and `UNIQUE` constraints.
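The last two points can be observed directly in SQLite. In this sketch the tables are simplified stand-ins rather than the real DDL: the `UNIQUE` constraint produces an internal index automatically, while the foreign key column needs an explicit one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the real tables.
CREATE TABLE fin_assets (
    id INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE  -- gets an internal index automatically
);
CREATE TABLE balances (
    id INTEGER PRIMARY KEY,
    asset_id INTEGER NOT NULL REFERENCES fin_assets(id),
    amount REAL NOT NULL
);

-- Foreign keys are not indexed automatically, so index them explicitly.
CREATE INDEX i_balances_asset_id ON balances(asset_id);
""")

# Column 1 of PRAGMA index_list is the index name.
asset_indexes = [row[1] for row in conn.execute("PRAGMA index_list('fin_assets')")]
balance_indexes = [row[1] for row in conn.execute("PRAGMA index_list('balances')")]
```

An `INTEGER PRIMARY KEY` column is an alias for the rowid, so it needs no separate index and none is listed for it.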
### Naming
* Snake case for identifiers, no prefixes except for indexes.
* Index name must be `i_[table_name]_[column names joined by "_"]`.
* The last noun in the table name is pluralized.
* Many-to-many (join) tables are named with a combination of tables they join. Although if referred tables have the same name prefix, it shall be used once. The table that usually has more records comes first.
* Foreign key column names must end with `_id` and should point to the primary key. Their names must meet the table context and usually do not need extra database-wide prefixing.
* Names of the columns that store booleans must start with `is_`.
* `code` - string identifier that is required, unique in some way per table case-insensitively and contains no spaces. `code` is a static thing used to identify rows externally upon data edit.
* `name` - identifier for display purposes that can be edited anytime.