This commit is contained in:
yrzam 2023-12-04 04:23:36 +04:00
commit 0a21106895
3 changed files with 688 additions and 0 deletions

2
.gitignore vendored Normal file
View File

@ -0,0 +1,2 @@
test

231
README.md Normal file
View File

@ -0,0 +1,231 @@
# FinDB - my financial data model & tools
## About
This repository contains the SQLite schema and scripts to handle personal financial data. It takes a generic approach to assets, transactions, balances and other things providing heavy automation capabilities, extensibility & deep analysis opportunities.
### Why SQLite but not spreadsheets?
* **Schema**. Spreadsheets are terrible. They're made for structured data although it is nearly impossible to structure data in them. A spreadsheet is literally a set of cells with their own rules, no decent data validation mechanisms, no relations, and schema is stored entirely in the user's mind.
* **Interface.** You can easily render graphs on HTML pages. "DB Browser for SQLite" can be used to edit data. SQL is very convenient for analytical/bulk update purposes, for scripting such as automatic data import. Queries & DDL have great readability. And you may find other ways to interact with the database.
* **Performance, limitations, control.** This project can handle millions of rows, while it's simply impossible to store them in Excel. Indexes are a basic thing for a database. The user has wider control over query execution.
* **Portability.** SQLite runs everywhere, it's fast, lightweight, free, secure and does not need an internet connection.
### What to start with?
`fin_transactions` and `balances` are core tables that need regular updates. They will lead you to other important things. For convenience, see the views.
## Schema
Below you can find the short summary with usage notes for each table. Only columns that need clarification are described. For the complete structure please see [DDL](./schema.sql).
### fin_assets (table)
A **Financial asset** is something you can track a balance of. It should be fungible and tradable.
```
is_base - defines whether this is a main unit of measure of your portfolio. Exactly one row must have this set to true
...
```
> Example: US Dollar.
### fin_asset_types (table)
A **type of financial asset** (also known as an asset class) describes the nature of the financial assets.
**The number of records should not exceed 8 due to presentation reasons.**
> Examples: fiat currency, equity, bond, etc.
### fin_storages (table)
**Financial storage** represents a place where assets are kept.
> Examples: savings account at a specific bank or broker, bag at home, cryptocurrency wallet.
### fin_assets_storages (table)
Join table. Financial storage can hold many assets, and an asset can be held in many storages.
```
allocation_group_id - shows which allocation group does a specific asset stored in specific storage belong to
priority - used for sorting balances view and possibly other things
...
```
### fin_asset_rates (table)
Stores historical exchange rates of financial assets.
```
rate = [asset_id value] / [base asset from fin_assets value] at [date]
...
```
**Rates should be up-to-date on the last day of each month.**
### current_fin_asset_rates (editable view)
Shows current exchange rates. If you edit something, a new rate will be saved with a `date` of the current day.
### fin_transactions (table)
Stores historical transactions. Transaction is an action that leads to a balance change in exactly one place.
Exchanges and self-transfers should be represented by two transactions, both having `is_rebalance` flag in the category. As for now, these two are not linked together due to the homogenous nature of the financial assets.
```
asset_storage_id - points to storage and asset that took part in in transaction, direction is determined by the sign of [amount]
amount - numeric, can be negative
reason_fin_asset_storage_id - (optional) financial asset, due to which transaction has occurred. This must be not manipulation with the asset itself, but the byproduct of its ownership
reason_phys_asset_id - (optional) physical asset, due to which transaction has occurred. This must be not manipulation with the asset itself, but the byproduct of its ownership
...
```
**Transactions should be up-to-date on the last day of each month, as they are matched with balances. Transactions can be grouped into large blocks that are consistent with the overall balance delta.**
### fin_transaction_categories (table)
A **transaction category** describes the logical sense of the transaction.
```
is_rebalance - whether the transaction is a part of self-transfer / exchange
is_passive - whether the income or expense is passive (see definition below)
...
```
Income or expense is considered passive if two conditions are met:
1. It occurred because of some ownership
2. * For gains, there are no severe non-monetary losses associated with the transaction reason.
* For losses, there are no severe non-monetary gains associated with the transaction reason.
> For example, paying tax for the property a person lives in is not a passive loss, although paying it for a property that they lend is a passive loss.
> Examples of transaction categories: salary transfer, rent payment, self transfer or exchange, dividends payout.
### balances (table)
Stores historical balances.
**Balances should be up-to-date on the last day of each month, as they are matched with transactions and rates.**
### current_balances (editable view)
Shows current exchange rates.
If you edit `balance`, a supplied one will be saved with a date of the current day. You can also insert a new row - it will insert a record into `fin_assets_astorages` if needed, and upsert balances. For updates and inserts, lookup is performed via the tuple `{asset_type, asset_code, storage}`, and `balance` is used as the value. Other columns are ignored.
### balance_goals (table)
Financial planning begins there. A **balance goal** is a plan to have a certain amount of financial assets on a balance for the specific purpose in the future, or keep it there constantly.
It is possible to have multiple goals per asset & storage - to complete all of them balance must be equal to or greater than sum of individual goals. Their progress is counted one by one according to the `priority`.
```
deadline - shows last desired date of completion
result_transaction_id - if saving resulted in the transaction, that transaction can be linked here. Such a goal will be considered complete
...
```
Cancelled goals should be deleted.
## current_balance_goals (view)
View that shows statuses of all goals (amount left, whether goal is accomplished, etc.).
Goals that result in financial transactions are hidden. Goal is considered accomplished if there are enough corresponding assets on the balance. It is reversible, thus needs your attention. Accomplished goals are listed at the bottom of the view.
### phys_assets (table)
Represents real-world assets, purchases and other non-fungible (non-interchangeable) things that a person owns. The intended use case is to track large and important assets, especially ones that generate passive gains and losses.
The asset is considered currently owned if it is neither sold (`sell_transaction_id is null`) nor naturally expired (`is_expired=0`).
> Examples: house, apartment rented for a year, commercial property, car, blockchain NFT
### fin_allocation_groups (table)
This table sets a goal for the financial asset distribution.
Each asset & storage (`fin_assets_storages`) from your portfolio can reference a specific allocation group.
```
target share - a desired fraction of all your assets that the group should take. This may be any non-negative number
...
```
This table is not historical.
> Examples: allocation group "CASH" should have a 5% target share
### current_fin_allocation (view)
Shows current asset allocation calculated based on your balance and exchange rates. Both current and target shares are displayed.
### historic_monthly_balances (view)
Shows monthly balance with source, calculates deltas - total and grouped by asset type.
Data is calculated over 10 years with a period of 1 month for the last day of that month.
```
base_balance - total balance, converted to the base asset
base_balance_delta - balance change since the previous month
base_active_delta - delta (gains - losses) for all transactions that are not passive income/expenses and occurred during this month
base_passive_delta - balance change caused by exchange rate fluctuations, passive income/expenses, non-specified transactions
*_by_type - same data but per asset type
...
```
## Schema conventions
### Structure
* Schema shall be described in pure DDL. No initial tuples are allowed.
* Each table has an `id` column as a primary key, stated as `INTEGER AUTOINCREMENT`.
* Boolean is `INTEGER` 1 or 0, date is `NUMERIC`.
* Enforce unique in constraints, not indexes.
* Editable views have a `pseudo_id` column with unique non-null values so that client software can identify which row is being edited.
### Performance
* Expect gigabytes of data, do tricks such as force materializing.
* Do not use a view as a part of another view.
* The primary query is `SELECT`, so it makes sense to create many indexes.
* Always index foreign keys, remove indexes that are part of other covering indexes.
* Rely on internal indexes produced by primary keys and `UNIQUE` constraint
### Naming
* Snake case for identifiers, no prefixes except for indexes.
* Index name must be `i_[table_name]_[column names joined by "_"]`.
* The last noun in the table name is pluralized.
* Many-to-many (join) tables are named with a combination of tables they join. Although if referred tables have the same name prefix, it shall be used once. The table that usually has more records comes first.
> fin_assets + fin_storages => fin_assets_storages
* Foreign key column names must end with `_id` and should point to the primary key. Their names must meet the table context and usually do not need extra database-wide prefixing.
* Names of the columns that store boolean must start with `is_`
### Common names
* `code` - string identifier that is required, unique in some way per table and contains no spaces. `code` is a static thing used to identify rows upon data edit.
* `name` - identifier just for the display purposes, that can be edited anytime
* `priority` - unique `INTEGER` value for sorting and other purposes
* `is_active` - used to hide non-needed entries from the current representation, keeping them as historical data

455
schema.sql Normal file
View File

@ -0,0 +1,455 @@
CREATE TABLE IF NOT EXISTS "fin_storages" (
"id" INTEGER NOT NULL,
"name" TEXT NOT NULL UNIQUE,
"is_active" INTEGER NOT NULL DEFAULT 1,
PRIMARY KEY("id")
);
CREATE TABLE IF NOT EXISTS "fin_asset_types" (
"id" INTEGER NOT NULL,
"name" TEXT NOT NULL,
PRIMARY KEY("id")
);
CREATE TABLE IF NOT EXISTS "fin_asset_rates" (
"id" INTEGER NOT NULL,
"date" NUMERIC NOT NULL,
"asset_id" INTEGER NOT NULL,
"rate" NUMERIC NOT NULL,
UNIQUE("date","asset_id"),
PRIMARY KEY("id"),
FOREIGN KEY("asset_id") REFERENCES "fin_assets"("id")
);
CREATE TABLE IF NOT EXISTS "balances" (
"id" INTEGER NOT NULL,
"date" NUMERIC NOT NULL,
"asset_storage_id" INTEGER NOT NULL,
"amount" NUMERIC NOT NULL,
UNIQUE("date","asset_storage_id"),
FOREIGN KEY("asset_storage_id") REFERENCES "fin_assets_storages"("id"),
PRIMARY KEY("id")
);
CREATE TABLE IF NOT EXISTS "fin_allocation_groups" (
"id" INTEGER NOT NULL,
"name" TEXT NOT NULL UNIQUE,
"target_share" NUMERIC NOT NULL DEFAULT 0, priority int,
PRIMARY KEY("id")
);
CREATE INDEX "i_fin_asset_rates_asset_id_date" ON "fin_asset_rates" (
"asset_id",
"date" DESC
);
CREATE INDEX "i_balances_asset_storage_id_date" ON "balances" (
"asset_storage_id",
"date" DESC
);
CREATE TABLE IF NOT EXISTS "fin_assets_storages" (
"id" INTEGER NOT NULL,
"asset_id" INTEGER NOT NULL,
"storage_id" INTEGER NOT NULL,
"priority" INTEGER UNIQUE,
"allocation_group_id" INTEGER,
FOREIGN KEY("asset_id") REFERENCES "fin_assets"("id"),
FOREIGN KEY("allocation_group_id") REFERENCES "fin_allocation_groups"("id"),
FOREIGN KEY("storage_id") REFERENCES "fin_storages"("id"),
PRIMARY KEY("id"),
UNIQUE("asset_id","storage_id")
);
CREATE INDEX "i_fin_assets_storages_asset_id" ON "fin_assets_storages" (
"asset_id"
);
CREATE INDEX "i_fin_assets_storages_storage_id" ON "fin_assets_storages" (
"storage_id"
);
CREATE INDEX "i_fin_assets_storages_allocation_group_id" ON "fin_assets_storages" (
"allocation_group_id"
);
CREATE TABLE IF NOT EXISTS "fin_assets" (
"id" INTEGER NOT NULL,
"code" TEXT NOT NULL,
"name" TEXT,
"type_id" INTEGER NOT NULL,
"is_base" INTEGER NOT NULL DEFAULT 0,
"is_active" INTEGER NOT NULL DEFAULT 1,
UNIQUE("type_id","code"),
PRIMARY KEY("id"),
FOREIGN KEY("type_id") REFERENCES "fin_asset_types"("id")
);
CREATE INDEX "i_fin_assets_is_base" ON "fin_assets" (
"is_base"
);
CREATE INDEX "i_fin_assets_type_id" ON "fin_assets" (
"type_id"
);
CREATE VIEW "current_fin_asset_rates" as
select
fa.id as pseudo_id,
fat.name as asset_type,
fa.code as asset,
(select rate from fin_asset_rates far where far.asset_id=fa.id order by far.date desc limit 1) as rate,
(select fa2.code from fin_assets fa2 where fa2.is_base=1 limit 1) as base_asset
from
fin_assets fa
join fin_asset_types fat on fat.id=fa.type_id
where
fa.is_active=1
/* current_fin_asset_rates(pseudo_id,asset_type,asset,rate,base_asset) */;
CREATE TRIGGER current_fin_asset_rates_update
instead of update of rate on current_fin_asset_rates
begin
insert into fin_asset_rates (asset_id, date, rate)
values(
(
select
fa.id
from
fin_assets fa
join fin_asset_types fat on fat.id=fa.type_id
where
fa.code=new.asset and
fat.name=new.asset_type
),
date('now'),
new.rate
)
on conflict(asset_id,date) do update
set rate=new.rate;
end;
CREATE VIEW current_fin_allocation as
select
t.name as "group",
round(t.base_balance,2) as base_balance,
(select code from fin_assets fa2 where fa2.is_base=1 limit 1) as base_asset,
(round(t.current_share, 1) || '%') as current_share,
(round(t.target_share, 1) || '%') as target_share
from (
select
t.name,
t.base_balance,
t.base_balance*100 / sum(t.base_balance) over() as current_share,
t.target_share,
1 as sort1,
t.priority
from
(
select
fag.name,
fag.priority,
coalesce(
sum(
(select b.amount from balances b where b.asset_storage_id=fas.id order by b.date desc limit 1)*
(select far.rate from fin_asset_rates far where far.asset_id=fa.id order by far.date desc limit 1)
),
0) as base_balance,
fag.target_share * 100 / sum(fag.target_share) over () as target_share
from
fin_allocation_groups fag
left join fin_assets_storages fas on fas.allocation_group_id=fag.id
left join fin_assets fa on fa.id=fas.asset_id
left join fin_storages fs on fs.id=fas.storage_id
where
fas.id is null or
(fa.is_active=1 and fs.is_active=1)
group by
fag.id
) t
union all
select
'Others' as name,
sum(
(select b.amount from balances b where b.asset_storage_id=fas.id order by b.date desc limit 1)*
(select far.rate from fin_asset_rates far where far.asset_id=fa.id order by far.date desc limit 1)
) as base_balance,
null as current_share,
null as target_share,
0 as sort1,
null as priority
from
fin_assets_storages fas
join fin_assets fa on fa.id=fas.asset_id
join fin_storages fs on fs.id=fas.storage_id
where
fas.allocation_group_id is null and
fa.is_active=1 and
fs.is_active=1
) t
where
t.base_balance!=0 or t.target_share!=0
order by
sort1 desc, priority desc
/* current_fin_allocation("group",base_balance,base_asset,current_share,target_share) */;
CREATE VIEW "current_balances" AS select
*
from (
select
fas.id as pseudo_id,
fat.name as asset_type,
coalesce(fa.code,fa.name) as asset_name,
fs.name as storage,
(select b.amount from balances b where b.asset_storage_id=fas.id order by b.date desc limit 1) as balance,
fa.code as asset_code,
round(
(select b.amount from balances b where b.asset_storage_id=fas.id order by b.date desc limit 1)*
(select far.rate from fin_asset_rates far where far.asset_id=fa.id order by far.date desc limit 1),
2) as base_balance,
(select code from fin_assets fa2 where fa2.is_base=1 limit 1) as base_asset
from
fin_assets_storages fas
join fin_assets fa on fa.id=fas.asset_id
join fin_asset_types fat on fat.id=fa.type_id
join fin_storages fs on fs.id=fas.storage_id
where
fa.is_active=1 and
fs.is_active=1
order by
fas.priority desc
) t
where
t.balance is not null
/* current_balances(pseudo_id,asset_type,asset_name,storage,balance,asset_code,base_balance,base_asset) */;
CREATE TRIGGER current_balances_insert
instead of insert on current_balances
begin
insert into fin_assets_storages(asset_id, storage_id)
values(
(select fa.id from fin_assets fa join fin_asset_types fat on fat.id=fa.type_id where fa.code=new.asset_code and fat.name=new.asset_type),
(select id from fin_storages where name=new.storage)
)
on conflict do nothing;
insert into balances(asset_storage_id, date, amount)
values(
(
select
fas.id
from
fin_assets_storages fas
join fin_assets fa on fa.id=fas.asset_id
join fin_asset_types fat on fat.id=fa.type_id
join fin_storages fs on fs.id=fas.storage_id
where
fa.code = new.asset_code and
fat.name = new.asset_type and
fs.name = new.storage
),
date('now'),
new.balance
)
on conflict(asset_storage_id, date) do update
set
amount = new.balance;
end;
CREATE TRIGGER current_balances_update
instead of update of balance on current_balances
begin
insert into balances(asset_storage_id, date, amount)
values(
(
select
fas.id
from
fin_assets_storages fas
join fin_assets fa on fa.id=fas.asset_id
join fin_asset_types fat on fat.id=fa.type_id
join fin_storages fs on fs.id=fas.storage_id
where
fa.code = new.asset_code and
fat.name = new.asset_type and
fs.name = new.storage
),
date('now'),
new.balance
)
on conflict(asset_storage_id, date) do update
set
amount = new.balance;
end;
CREATE TABLE IF NOT EXISTS "fin_transactions" (
"id" INTEGER NOT NULL,
"date" NUMERIC NOT NULL,
"asset_storage_id" INTEGER NOT NULL,
"amount" NUMERIC NOT NULL,
"category_id" INTEGER NOT NULL,
"reason_fin_asset_storage_id" INTEGER,
"reason_phys_asset_id" INTEGER,
FOREIGN KEY("reason_fin_asset_storage_id") REFERENCES "fin_assets_storages"("id"),
FOREIGN KEY("category_id") REFERENCES "fin_transaction_categories"("id"),
PRIMARY KEY("id"),
FOREIGN KEY("reason_phys_asset_id") REFERENCES "phys_assets"("id")
);
CREATE INDEX "i_fin_transactions_asset_storage_id_date" ON "fin_transactions" (
"asset_storage_id",
"date" DESC
);
CREATE INDEX "i_fin_transactions_result_fin_asset_storage_id" ON "fin_transactions" (
"reason_fin_asset_storage_id"
);
CREATE INDEX "i_fin_transactions_category_id" ON "fin_transactions" (
"category_id"
);
CREATE TABLE IF NOT EXISTS "phys_assets" (
"id" INTEGER NOT NULL,
"name" TEXT NOT NULL UNIQUE,
"buy_transaction_id" INTEGER,
"sell_transaction_id" INTEGER,
"is_expired" INTEGER NOT NULL DEFAULT 0,
FOREIGN KEY("buy_transaction_id") REFERENCES "fin_transactions"("id"),
FOREIGN KEY("sell_transaction_id") REFERENCES "fin_transactions"("id"),
PRIMARY KEY("id")
);
CREATE TABLE IF NOT EXISTS "balance_goals" (
"id" INTEGER NOT NULL,
"name" TEXT NOT NULL,
"asset_storage_id" INTEGER NOT NULL,
"amount" NUMERIC NOT NULL,
"priority" INTEGER NOT NULL UNIQUE,
"deadline" NUMERIC,
"result_transaction_id" INTEGER,
FOREIGN KEY("asset_storage_id") REFERENCES "fin_assets_storages"("id"),
PRIMARY KEY("id"),
FOREIGN KEY("result_transaction_id") REFERENCES "fin_transactions"("id")
);
CREATE INDEX "i_balance_goals_asset_storage_id" ON "balance_goals" (
"asset_storage_id"
);
CREATE INDEX "i_balance_goals_result_phys_asset_id" ON "balance_goals" (
"result_transaction_id"
);
CREATE INDEX "i_fin_transactions_result_phys_asset_id" ON "balance_goals" (
"result_transaction_id"
);
CREATE VIEW "current_balance_goals" AS select
t.amount_left=0 as is_accomplished,
t.goal,
t.storage,
t.amount_total,
t.amount_left,
case when t.amount_left!=0 then t.deadline end as deadline
from (
select
bg.name as goal,
coalesce(fa.name,fa.code)||' - '||fs.name as storage,
bg.amount as amount_total,
bg.priority,
bg.deadline,
min(max(
sum(bg.amount) over(
partition by
bg.asset_storage_id
order by
bg.priority desc
rows between
unbounded preceding and current row
) - coalesce((
select
amount
from
balances b
where
b.asset_storage_id=bg.asset_storage_id
order by
b.date desc
limit 1
),0),
0
), bg.amount) as amount_left
from
balance_goals bg
join fin_assets_storages fas on fas.id=bg.asset_storage_id
join fin_assets fa on fa.id=fas.asset_id
join fin_storages fs on fs.id=fas.storage_id
where
bg.result_transaction_id is null
) t
order by
is_accomplished asc,
t.priority desc
/* current_balance_goals(is_accomplished,goal,storage,amount_total,amount_left,deadline) */;
CREATE TABLE IF NOT EXISTS "fin_transaction_categories" (
"id" INTEGER NOT NULL,
"name" TEXT NOT NULL UNIQUE,
"is_rebalance" INTEGER NOT NULL DEFAULT 0,
"is_passive" INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY("id")
);
CREATE VIEW historical_monthly_balances as
with recursive dates(mo_start_date, date) as (
values(
date('now', 'start of month'),
date('now', 'start of month', '+1 month', '-1 day')
)
union all
select
date(mo_start_date, '-1 month'),
date(mo_start_date, '-1 day')
from
dates
where
mo_start_date > (select min(date) from balances)
limit 10*12 -- use yearly view for periods over 10 years
),
constants as(
select
(select code from fin_assets fa2 where fa2.is_base=1 limit 1) as base_asset
),
data_by_type as materialized( -- force materialize - 2x faster due to lead()
select
d.date,
fat.id as asset_type_id,
(
select
sum(
(select b.amount from balances b where b.asset_storage_id=fas.id and b.date<=d.date order by b.date desc limit 1)*
(select far.rate from fin_asset_rates far where far.asset_id=fas.asset_id and far.date<=d.date order by far.date desc limit 1)
)
from
fin_assets fa
join fin_assets_storages fas on fas.asset_id=fa.id
where
fa.type_id=fat.id
) as base_balance,
(
select
coalesce(sum(
(select sum(ft.amount) from fin_transactions ft join fin_transaction_categories ftc on ftc.id=ft.category_id where ft.asset_storage_id=fas.id and ft.date>=d.mo_start_date and ft.date<=d.date and ftc.is_rebalance=0 and ftc.is_passive=0)*
(select far.rate from fin_asset_rates far where far.asset_id=fas.asset_id and far.date<=d.date order by far.date desc limit 1)
),0)
from
fin_assets fa
join fin_assets_storages fas on fas.asset_id=fa.id
where
fa.type_id=fat.id
) as base_active_delta
from
dates d
cross join fin_asset_types fat
)
select
t.date,
round(sum(t.base_balance),2) as base_balance,
round(sum(t.base_balance_delta),2) as base_balance_delta,
round(sum(t.base_active_delta),2) as base_active_delta,
round(sum(t.base_balance_delta)-sum(t.base_active_delta),2) as base_passive_delta,
(select base_asset from constants) as base_asset,
group_concat(fat.name||'='||cast(t.base_balance as integer) || ' ' || (select base_asset from constants), ' ') as base_balance_by_type,
group_concat(fat.name||'='||cast(t.base_balance_delta as integer) || ' ' || (select base_asset from constants), ' ') as base_balance_delta_by_type,
group_concat(fat.name||'='||cast(t.base_active_delta as integer) || ' ' || (select base_asset from constants), ' ') as base_active_delta_by_type,
group_concat(fat.name||'='||cast(t.base_balance_delta-t.base_active_delta as integer) || ' ' || (select base_asset from constants), ' ') as base_passive_delta_by_type
from
(
select
t.*,
t.base_balance-lead(t.base_balance) over(partition by t.asset_type_id order by date desc) as base_balance_delta
from
data_by_type t
where
t.base_balance is not null
) t
left join fin_asset_types fat on fat.id=t.asset_type_id
group by
1
order by
t.date desc
/* historical_monthly_balances(date,base_balance,base_balance_delta,base_active_delta,base_passive_delta,base_asset,base_balance_by_type,base_balance_delta_by_type,base_active_delta_by_type,base_passive_delta_by_type) */;