
# FinDB - personal financial data model & tools


## About

This repository contains a SQLite schema and scripts for handling personal financial data. It models assets, transactions, balances and related entities generically, which makes the data easy to automate, extend and analyze in depth.


### Why SQLite but not spreadsheets?

* **Schema.** Spreadsheets are terrible. They're made for structured data, yet it is nearly impossible to structure data in them. A spreadsheet is literally a set of cells with their own rules: no decent data validation, no entity relationships, and a schema that lives entirely in the user's mind.

* **Interface.** You can easily render graphs on HTML pages. "DB Browser for SQLite" can be used to edit data. SQL is very convenient for analysis, bulk updates and scripting such as automatic data import. Queries and DDL are highly readable. And you may find other ways to interact with the database.

* **Performance, limitations, control.** This project can handle millions of rows, while it's simply impossible to store them in Excel. Indexes are a basic thing for a database. The user has wider control over query execution.

* **Portability.** SQLite runs everywhere, it's fast, lightweight, free, secure and does not need an internet connection.

### What to start with?
`fin_transactions` and `balances` are core tables that need regular updates. They will lead you to other important things. For convenience, see the views.


## Schema

Below is a short summary with usage notes for each table. Not all details are described. For the complete structure, please see the [DDL](./schema.sql).


### fin_assets (table)

A **Financial asset** is something you can track a balance of. It should be fungible and tradable.

```
id          (pk)
code        (text not null, no whitespace, uppercase, unique per type)
name        (text)
description (text)
type_id     (fk fin_asset_types not null)
is_base     (boolean as integer not null) - whether this is a main unit of measurement of your portfolio. Exactly one row must have this set to true
is_active   (boolean as integer not null)
```

> Example: US Dollar. 


### fin_asset_types (table)

A **type of financial asset** (also known as an asset class) describes the nature of the financial asset.

```
id   (pk)
name (text unique not null)
```

**The number of records should not exceed 8 due to presentation reasons.**

> Examples: fiat currency, equity, bond, etc.   


### fin_storages (table)

**Financial storage** represents a place where assets are kept.

```
id          (pk)
name        (text unique not null)
description (text)
is_active   (boolean as integer not null)
```

> Examples: savings account at a specific bank or broker, virtual debt account, bag at home, cryptocurrency wallet.


### fin_assets_storages (table)

Join table. Financial storage can hold many assets, and an asset can be held in many storages. These intersections must be unique.

```
id         (pk)
asset_id   (fk fin_assets not null)
storage_id (fk fin_storages not null)
priority   (integer unique) - used for sorting balances view and possibly other things 
allocation_group_id (fk fin_allocation_groups) - the allocation group that this asset, as stored in this storage, belongs to
```


### fin_asset_rates (table)

Stores historical exchange rates of financial assets.

```
id       (pk)
datetime (datetime as text not null)
asset_id (fk fin_assets not null)
rate     (real not null) - [asset value] / [base asset value]
```

Generally, you want to know the exchange rate of an asset right before a transaction or balance snapshot, although deeper historical data can also be stored for analytical purposes.
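
As a minimal sketch of the "rate right before a moment" lookup, the query below picks the latest `fin_asset_rates` row strictly earlier than a given datetime. The helper name `rate_at` and the sample rows are hypothetical, and the sketch assumes datetimes are stored as ISO 8601 text (`YYYY-MM-DD HH:MM:SS`), so that text comparison matches chronological order.

```python
import sqlite3


def rate_at(conn, asset_id, moment):
    """Return the latest rate recorded strictly before `moment`, or None."""
    row = conn.execute(
        "SELECT rate FROM fin_asset_rates "
        "WHERE asset_id = ? AND datetime < ? "
        "ORDER BY datetime DESC LIMIT 1",
        (asset_id, moment),
    ).fetchone()
    return row[0] if row else None


# Reduced copy of the table, only for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fin_asset_rates ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "datetime TEXT NOT NULL, "
    "asset_id INTEGER NOT NULL, "
    "rate REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO fin_asset_rates (datetime, asset_id, rate) VALUES (?, ?, ?)",
    [("2025-01-01 00:00:00", 1, 1.10), ("2025-02-01 00:00:00", 1, 1.20)],
)

print(rate_at(conn, 1, "2025-01-15 12:00:00"))  # 1.1
```

If no rate exists before the requested moment, the helper returns `None` rather than guessing a value.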


### fin_transactions (table)

Stores historical transactions. A transaction is an action that leads to a balance change in exactly one place.

```
id               (pk)
datetime         (datetime as text not null)
description      (text)
asset_storage_id (fk fin_assets_storages not null) - points to storage and asset that took part in the transaction
amount           (not 0, real not null) - value, direction is determined by the sign
category_id      (fk fin_transaction_categories not null)
reason_fin_asset_storage_id (fk fin_assets_storages) - see below
reason_phys_asset_ownership_id (fk phys_asset_ownerships) - see below
```

`reason_*` should point to either a financial or a physical asset owned by the person if two conditions are met:
1. The transaction occurred because of that ownership.
2. The title of that ownership was not directly or indirectly affected by the current transaction.

You may group transactions into batches if it is impossible to log them all.


### fin_transaction_categories (table)

A **transaction category** describes the logical sense of the transaction.

Transaction categories must form a hierarchy with only one root. If a certain flag (starting with `is_`) is set to true on a category, it must be set to true on all of its child categories. The integrity is ensured via triggers that throw exceptions.

```
id                (pk)
name              (text unique not null)
is_passive        (boolean as integer not null) - see below
is_initial_import (boolean as integer not null) - whether the transaction is an upload of the existing assets for accounting
parent_id         (fk fin_transaction_categories) - reference to a category that is a superset of the current one 
min_view_depth    (integer not null) - in the flattened representation, category shall not appear on a level lower than N, N>=0. Parent category takes multiple levels instead
```
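
The flag rule above can be sketched as a trigger that aborts an offending insert. This is only an illustration under stated assumptions, not the repository's actual DDL: the trigger name is hypothetical, the table is reduced to the columns needed, and only `is_passive` on `INSERT` is covered. A complete solution would also need triggers for `UPDATE` and for the other `is_` flags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fin_transaction_categories (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    name       TEXT NOT NULL UNIQUE,
    is_passive INTEGER NOT NULL,
    parent_id  INTEGER REFERENCES fin_transaction_categories (id)
);

-- Hypothetical trigger: a child must keep a flag that its parent has set.
CREATE TRIGGER t_categories_is_passive_insert
BEFORE INSERT ON fin_transaction_categories
BEGIN
    SELECT RAISE(ABORT, 'child of a passive category must be passive')
    WHERE NEW.is_passive = 0
      AND EXISTS (
          SELECT 1 FROM fin_transaction_categories p
          WHERE p.id = NEW.parent_id AND p.is_passive = 1
      );
END;
""")

insert = (
    "INSERT INTO fin_transaction_categories (name, is_passive, parent_id) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert, ("income", 0, None))       # root, id 1
conn.execute(insert, ("passive income", 1, 1))  # passive child, id 2

try:
    # Non-passive child of a passive parent: the trigger should reject it.
    conn.execute(insert, ("dividends", 0, 2))
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

print(rejected)  # True
```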

Income or expense is considered passive if three conditions are met:

1. It occurred because of some ownership
2. Title of that ownership was not directly or indirectly affected by this transaction
3. For gains, there are no significant non-monetary losses associated with the transaction reason; for losses, there are no significant non-monetary gains associated with the transaction reason.

> For example, paying tax on a property that a person lives in is not a passive loss, whereas paying it on a property that they rent out is a passive loss.

Any transaction in a category with `is_passive` set must link to a `reason_*` asset, because the conditions for setting a `reason_*` on a transaction are a subset of the conditions for `is_passive` on a category. However, a transaction with a `reason_*` may still be classified as non-passive if it implied some non-monetary gains or losses. This distinction can be useful when evaluating the overall impact of an asset.

> Examples of transaction categories: expense, salary transfer, rent payment, self transfer or exchange, dividends payout.


### balances (table)

Stores historical balances.

A balance is a verified snapshot of the amount of some asset held in a storage at a given time. Therefore, a balance entry may appear at any time without prior transaction activity, and a transaction does not create an obligation to update the balance right after it happens. A balance is consistent with transactions as long as the transaction delta before the balance `datetime` equals the balance value: `[balance] = [sum amount of txs where tx datetime < balance datetime]`. The amount can be negative.

```
id       (pk)
datetime (datetime as text not null)
amount   (real not null)
asset_storage_id (fk fin_assets_storages not null)
```
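
The consistency rule can be checked with a small query, sketched below. The helper name `unaccounted_amount` and the sample data are hypothetical, the tables are reduced to the columns involved, and datetimes are assumed to be ISO 8601 text. Because amounts are `REAL`, a real check would likely compare against a small tolerance instead of exact zero.

```python
import sqlite3


def unaccounted_amount(conn, asset_storage_id, balance_datetime):
    """Balance amount minus the sum of transactions strictly before it."""
    balance_row = conn.execute(
        "SELECT amount FROM balances "
        "WHERE asset_storage_id = ? AND datetime = ?",
        (asset_storage_id, balance_datetime),
    ).fetchone()
    if balance_row is None:
        raise LookupError("no balance snapshot at that datetime")
    tx_sum = conn.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM fin_transactions "
        "WHERE asset_storage_id = ? AND datetime < ?",
        (asset_storage_id, balance_datetime),
    ).fetchone()[0]
    return balance_row[0] - tx_sum


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fin_transactions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    datetime TEXT NOT NULL,
    asset_storage_id INTEGER NOT NULL,
    amount REAL NOT NULL
);
CREATE TABLE balances (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    datetime TEXT NOT NULL,
    amount REAL NOT NULL,
    asset_storage_id INTEGER NOT NULL
);
INSERT INTO fin_transactions (datetime, asset_storage_id, amount) VALUES
    ('2025-01-01 09:00:00', 1, 100.0),
    ('2025-01-02 09:00:00', 1, -30.5),
    ('2025-01-04 09:00:00', 1, -10.0);  -- after the snapshot, ignored
INSERT INTO balances (datetime, asset_storage_id, amount)
VALUES ('2025-01-03 00:00:00', 1, 69.5);
""")

print(unaccounted_amount(conn, 1, "2025-01-03 00:00:00"))  # 0.0
```

A non-zero result means that either a transaction is missing or the snapshot disagrees with the recorded history.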

Whenever possible, balance should be queried directly from this table instead of aggregating transactions.

One may argue that storing balances separately is bad practice because it causes denormalization and data inconsistency. However, storing potentially conflicting data is a deliberate choice, since such data is loaded from external sources. The purpose of this project is to offer a viewpoint on multiple versions of the data in order to resolve these conflicts.


### balance_goals (table)

A **balance goal** is a plan to have a certain amount of a financial asset on a balance for a specific purpose in the future, or to keep it there permanently.

It is possible to have multiple goals per asset & storage. To complete all of them, the balance must be equal to or greater than the sum of the individual goals. Their progress is counted one by one according to `priority`.

```
id               (pk)
name             (text not null)
asset_storage_id (fk fin_assets_storages not null)
amount           (real not null)
priority         (integer unique not null)
deadline         (datetime as text) - shows last desired date of completion
result_transaction_id (fk fin_transactions) - if saving resulted in a transaction, that transaction can be linked here. Such a goal will be considered complete
start_datetime   (datetime as text not null) 
end_datetime     (datetime as text) - up to that moment the goal is relevant; always relevant if null
```
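
The "one by one" progress rule can be sketched as follows. This is a plain-Python illustration, not the logic of the actual view: the function name `goal_progress` and the sample goals are hypothetical, and it assumes that a lower `priority` number is served first.

```python
def goal_progress(balance, goals):
    """Distribute `balance` over goals in priority order.

    `goals` is a list of (name, amount, priority) tuples.
    Returns a list of (name, amount_left) in the order goals are served.
    """
    remaining = max(balance, 0.0)  # a negative balance covers nothing
    progress = []
    for name, amount, _priority in sorted(goals, key=lambda g: g[2]):
        covered = min(remaining, amount)
        remaining -= covered
        progress.append((name, amount - covered))
    return progress


goals = [("vacation", 1000.0, 2), ("emergency fund", 1000.0, 1)]
print(goal_progress(1500.0, goals))
# [('emergency fund', 0.0), ('vacation', 500.0)]
```

With a balance of 1500, the first goal is fully covered and the second one still lacks 500, so only a balance of 2000 or more completes both.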


### fin_allocation_groups (table)

This table sets a goal for the financial asset distribution.

Each asset & storage (`fin_assets_storages`) from your portfolio can reference a specific allocation group. Calculation should be performed based on `fin_asset_rates`.

```
id             (pk)
name           (text not null)
target_share   (real not null) - a desired fraction of all assets that the group should take, negative value means a negative target balance
start_datetime (datetime as text not null) - since that moment the rule is applied
end_datetime   (datetime as text) - up to that moment (excl) the rule is applied. Applied indefinitely if null
priority       (integer unique)
```

`target_share` may be any valid number, as it expresses a proportion. Equal target shares indicate that the values of the underlying assets should be the same. For a negative target share, there may exist a positive one with the same absolute value that compensates the corresponding debt. Normalization should happen at a later stage.

> Examples: allocation group "CASH" should have a 5% target share in 2025 Q2


### phys_assets (table)

Represents real-world assets, purchases and other non-fungible (non-interchangeable) things. The intended use case is to track large and important assets, especially ones that generate passive gains and losses.

```
id          (pk)
name        (text unique not null)
description (text)
```

> Examples: house, apartment rented for a year, commercial property, car


### phys_asset_ownerships (table)

Tracks whether a physical asset is owned by a person at a particular moment. One asset may be owned during many time periods, or not be owned at all.

```
id             (pk)
asset_id       (fk phys_assets not null)
start_datetime (datetime as text not null) - since that moment an asset is owned
end_datetime   (datetime as text) - up to that moment (excl) an asset is owned. Owned indefinitely if null
```

Ownership periods for the same asset must not intersect.
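
A minimal sketch of the non-intersection rule, treating each period as a half-open interval `[start_datetime, end_datetime)` where a null end means "owned indefinitely". The function name `periods_overlap` is hypothetical, and datetimes are assumed to be ISO 8601 text so that string comparison matches chronological order.

```python
def periods_overlap(a, b):
    """Return True if two (start, end) periods intersect.

    Each period is half-open: start is included, end is excluded.
    An end of None means the period never ends.
    """
    a_start, a_end = a
    b_start, b_end = b
    a_starts_before_b_ends = b_end is None or a_start < b_end
    b_starts_before_a_ends = a_end is None or b_start < a_end
    return a_starts_before_b_ends and b_starts_before_a_ends


first = ("2020-01-01 00:00:00", "2021-01-01 00:00:00")
second = ("2021-01-01 00:00:00", None)   # starts exactly when `first` ends
third = ("2020-06-01 00:00:00", "2020-07-01 00:00:00")

print(periods_overlap(first, second))  # False
print(periods_overlap(first, third))   # True
```

Because the end is exclusive, a period may start at the exact moment the previous one ends without violating the rule.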


### swaps (table)

Provides double-entry bookkeeping for the operations where both sides are tracked. Swap is an internal transfer of value that may happen between same or different assets, possibly of different nature. Swap changes the value allocation between financial accounts or physical items. 

```
id                       (pk)
credit_fin_tx_id         (fk fin_transactions)
credit_phys_ownership_id (fk phys_asset_ownerships)
debit_fin_tx_id          (fk fin_transactions)
debit_phys_ownership_id  (fk phys_asset_ownerships)
```

Therefore, possible operations are:
- fin asset -> fin asset (exchange or transfer)
- fin asset -> phys asset (buy)
- phys asset -> fin asset (sell)
- phys asset -> phys asset (exchange)
- phys asset -> phys asset + fin asset (exchange with change)

> Examples: transfer between bank accounts, currency exchange, buying some item 


### current_balances (view)

Shows current financial balances. Inactive assets and storages are skipped.

If you edit `balance`, the supplied value is saved as a balance dated to the current day. When a new row is inserted, a record in `fin_assets_storages` is created if needed and the balance is upserted.

> operations: select, update, insert

```
lookup tuple:
    asset_type (text lkp fin_asset_types.name not null)
    asset_code (text lkp fin_assets.code not null)
    storage    (text lkp fin_storages.name)
value:
    balance    (real not null)
info:
    pseudo_id
    asset_name
    base_balance - balance converted to base_asset
    base_asset


### latest_fin_transactions (view)

Shows the latest transactions. All fields except for pseudo-id are editable.

For inserts, there is also a special column, `adjust_balance`, which lets you auto-update `balances` with the amount of the current transaction. In that case, a balance entry with a datetime one second after the transaction is created or updated. This works only when the transaction uses the current datetime. **Please be cautious: while this option is convenient, used wrongly it may mess up your balance. Verify balances.**

> operations: select, update, insert, delete

```
lookup:
    pseudo_id (pk)
values:
    amount (real not null)
	datetime (datetime as text not null)
	category (text lkp fin_transaction_categories not null)
	reason_phys_asset (text lkp phys_assets via phys_asset_ownerships at datetime)
value tuples:
	1.
		asset_type (text lkp fin_asset_types.name not null)
		asset_code (text lkp fin_assets.code not null)
		storage    (text lkp fin_storages.name)
	2.
		reason_asset_type (text lkp fin_asset_types.name not null)
		reason_asset_code (text lkp fin_assets.code not null)
		reason_storage    (text lkp fin_storages.name)
info:
	asset_name
special:
	adjust_balance (boolean as integer, insert only) - pseudo-column, if set to true current operation will be auto-reflected in balances. Works only if transaction has datetime of the current moment
```


### historical_txs_balances_mismatch (view)

Helps keep balances consistent with transactions for analytical purposes.

This view contains a row only if there is a mismatch between transaction delta and balance delta during the last 2 years. It is advised to keep this view empty via adding missing transactions or adjusting balances.

> operations: select

```
info:
	start_datetime
	end_datetime
	storage
	amount_unaccounted - difference between balance and transaction delta
	tx_delta
	balance_delta
```


### current_fin_asset_rates (view)

Shows current exchange rates. Inactive assets are skipped. If you modify something, a new rate will be saved with a `datetime` of the current moment.

> operations: select, update, insert

```
lookup tuple:
    asset_type (text lkp fin_asset_types.name not null)
    asset      (text lkp fin_assets.code not null)
value:
    rate       (real not null)
info:
    pseudo_id
    base_asset
```
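
In SQLite, a view is read-only unless `INSTEAD OF` triggers are defined on it. The sketch below shows how an update on a "current rates" view could be turned into a new historical row. It is an illustration under assumptions, not the repository's DDL: the view `current_rates_sketch` and the trigger name are hypothetical, and the view is reduced to `asset_id` and `rate` instead of the lookup columns listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fin_asset_rates (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    datetime TEXT NOT NULL,
    asset_id INTEGER NOT NULL,
    rate     REAL NOT NULL
);

-- Simplified stand-in for current_fin_asset_rates: latest rate per asset.
CREATE VIEW current_rates_sketch AS
SELECT r.asset_id, r.rate
FROM fin_asset_rates r
WHERE r.datetime = (
    SELECT MAX(r2.datetime)
    FROM fin_asset_rates r2
    WHERE r2.asset_id = r.asset_id
);

-- Editing the view appends a new rate instead of changing history.
CREATE TRIGGER t_current_rates_sketch_update
INSTEAD OF UPDATE ON current_rates_sketch
BEGIN
    INSERT INTO fin_asset_rates (datetime, asset_id, rate)
    VALUES (datetime('now'), NEW.asset_id, NEW.rate);
END;

INSERT INTO fin_asset_rates (datetime, asset_id, rate)
VALUES ('2000-01-01 00:00:00', 1, 1.0);
""")

conn.execute("UPDATE current_rates_sketch SET rate = 2.0 WHERE asset_id = 1")

history = conn.execute(
    "SELECT rate FROM fin_asset_rates ORDER BY datetime"
).fetchall()
current = conn.execute(
    "SELECT rate FROM current_rates_sketch WHERE asset_id = 1"
).fetchone()[0]
print(history, current)  # [(1.0,), (2.0,)] 2.0
```

The old rate stays in `fin_asset_rates`, so historical calculations keep working after the edit.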


### current_balance_goals (view)

View that shows statuses of all goals (amount left, whether goal is accomplished, etc.). 

Goals that result in financial transactions are hidden. Goal is considered accomplished if there are enough corresponding assets on the balance. It is reversible, thus needs your attention. Accomplished goals are listed at the bottom of the view.

Goal is considered irrelevant and thus not shown if it either resulted in a transaction or the current moment is outside of the goal's datetime range. 

> operations: select

```
info:
	is_accomplished
	goal
	storage
	amount_total
	amount_left
	deadline
```


### current_fin_allocation (view)

Shows current asset allocation calculated based on your balance and exchange rates. Both current and target shares are displayed.

> operations: select

```
info:
	group
	base_balance
	base_asset
	current_share - calculated as balance / sum(abs(balance))
	target_share
```

The sum of the absolute `current_share` values is always 100 percent. However, a negative balance leads to a negative share. Thus the signed sum may vary from `-100` to `100`, where `100` means that all accounted balances are positive, `-100` means they are all negative, and `0` means that the sum of negative balances equals the sum of positive balances multiplied by `-1`.
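
A worked example of the `current_share` formula, as a plain-Python sketch. The function name `current_shares` and the group names are hypothetical, and returning zero shares for an all-zero portfolio is an assumption of this sketch.

```python
def current_shares(base_balances):
    """Map each group to balance / sum(abs(balance)), in percent."""
    total = sum(abs(value) for value in base_balances.values())
    if total == 0:
        # Assumption: an empty or all-zero portfolio has zero shares.
        return {group: 0.0 for group in base_balances}
    return {
        group: 100 * value / total
        for group, value in base_balances.items()
    }


shares = current_shares({"CASH": 500.0, "STOCKS": 1000.0, "DEBT": -500.0})
print(shares)                                 # {'CASH': 25.0, 'STOCKS': 50.0, 'DEBT': -25.0}
print(sum(shares.values()))                   # 50.0
print(sum(abs(s) for s in shares.values()))   # 100.0
```

Here the absolute shares add up to 100, while the signed sum is 50 because a quarter of the accounted value is debt.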


## Schema conventions

### Structure
* Schema shall be described in pure DDL. No initial tuples are allowed.
* Each table has an `id` column as its primary key, declared as `INTEGER PRIMARY KEY AUTOINCREMENT`; all foreign keys are also `INTEGER`s.
* Use strict types, but do not enable strict mode. Boolean is `INTEGER` 1 or 0, datetime is `TEXT`, numeric is `REAL` (unfortunately).
* Enforce unique in constraints, not indexes.
* Editable views have a `pseudo_id` column with unique non-null values so that client software can identify which row is being edited.

### Performance
* Expect gigabytes of data; use tricks such as forced materialization.
* Do not use a view as a part of another view.
* The primary query is `SELECT`, so it makes sense to create many indexes.
* Always index foreign keys, remove indexes that are part of other covering indexes.
* Rely on the internal indexes produced by primary keys and `UNIQUE` constraints.

### Naming
* Snake case for identifiers, no prefixes except for indexes.
* Index name must be `i_[table_name]_[column names joined by "_"]`.
* The last noun in the table name is pluralized.
* Many-to-many (join) tables are named with a combination of the tables they join. If the referenced tables share the same name prefix, it is used only once. The table that usually has more records comes first.
> fin_assets + fin_storages => fin_assets_storages 
* Foreign key column names must end with `_id` and should point to the primary key. Their names must meet the table context and usually do not need extra database-wide prefixing.
* Names of columns that store booleans must start with `is_`.

### Common names
* `code` - a required string identifier that is unique in some way per table (case-insensitively) and contains no spaces. `code` is static and is used to identify rows externally when data is edited.
* `name` - an identifier for display purposes that can be edited at any time.
* `priority` - a unique `INTEGER` value for sorting and other purposes.
* `is_active` - used to hide unneeded entries from the current representation while keeping them as historical data.