Manage Your Money is my 10th? 15th? attempt at writing a system for managing my personal finances.
Each time, I fall back to the convenience of a SaaS tool; but inevitably, a few days, weeks, or months later, I get frustrated with it for one reason or another.
To be fair, I didn’t try YNAB or other paid providers.
When writing my own financial tools, I'd often encounter challenges that I wasn't quite sure how to address from either a UX or a technical perspective; below, I describe how I resolve or avoid them.
I now have an approach for each of those pain points; I may not agree with it in a few months, but for now I'm happy with it.
Another day I’ll expound upon my many false starts.
The approach now is the following:
1. Download transaction data from each of my financial institutions.
2. Run a Python script that cleans the data and applies categorization rules, producing a single CSV.
3. Import the CSV into Google Sheets.
4. Manually review and correct the transactions.
5. Aggregate the data with a roll-up table of SUMIFs.
I had been avoiding a process like this because I thought step 1 would be too annoying, so for a while I aggregated the data in Mint and exported from there. But it turns out downloading the data from my current financial institutions is pretty easy; it was more challenging with my previous ones.
The Python script has a moderately interesting design.
The data is placed into directories that are organized by financial institution. Each directory is loaded using a particular loader function that knows how to handle the idiosyncrasies of that institution's format and outputs a pandas dataframe with a specific set of columns.
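To make that concrete, here is a minimal sketch of what one loader might look like. The directory layout, the load_cap1 name, the institution's export column names, and the exact output schema are my assumptions for illustration, not the script's actual code:

import glob
import os

import pandas as pd

# The column contract every loader must satisfy (my guess at the schema).
OUTPUT_COLUMNS = ["date", "description", "amount", "account", "type"]

def load_cap1(directory: str) -> pd.DataFrame:
    """Hypothetical loader for the cap1 credit card's export format."""
    frames = [pd.read_csv(path) for path in glob.glob(os.path.join(directory, "*.csv"))]
    raw = pd.concat(frames, ignore_index=True)
    df = pd.DataFrame({
        "date": pd.to_datetime(raw["Transaction Date"]),  # institution-specific column names
        "description": raw["Description"],
        "amount": raw["Debit"].fillna(0) - raw["Credit"].fillna(0),
    })
    df["account"] = "cc:cap1"  # tag every row with its source account
    df["type"] = "unknown"     # rules categorize this later
    return df[OUTPUT_COLUMNS]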
The dataframes are combined into one large dataframe. Its columns include an account, e.g. cc:cap1 for the cap1 credit card, and a type, which is one of income, expense, expense:reimbursable, expense:aberration, ignore, or unknown.
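Combining the per-institution dataframes is then a single concatenation. Continuing the sketch above (the loaders mapping and the data directory are again my assumptions):

import os

import pandas as pd

loaders = {
    "cap1": load_cap1,       # from the sketch above
    # "chase": load_chase,   # ...one loader per institution directory
}

frames = [loader(os.path.join("data", name)) for name, loader in loaders.items()]
combined = pd.concat(frames, ignore_index=True).sort_values("date")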
A series of rules is applied to the rows of the dataframe. The rules are easy to write because they use a Python decorator, like so:
from collections import OrderedDict
from typing import Any

rules_registry: OrderedDict[str, Any] = OrderedDict()

def register_rule(name: str):
    def decorator(func):
        rules_registry[name] = func
        return func
    return decorator

def apply_rules(df):
    # Apply every registered rule in registration order
    for name, rule in rules_registry.items():
        df = rule(df)
    return df

@register_rule("identify_income")
def rule_identify_income(df):
    # Set to income
    substrs = ["TRINET", "GUSTO"]
    for substr in substrs:
        df.loc[df["description"].str.contains(substr), "type"] = "income"
    return df

@register_rule("identify_rent")
def rule_identify_rent(df):
    substrs = ["BRISTOL EQUITIES", "APTS LEDGER"]
    for substr in substrs:
        # Mask if it matches the substring and has type unknown
        mask = (df["description"].str.contains(substr)) & (df["type"] == "unknown")
        df.loc[mask, "type"] = "expense"
    return df
After the rules are applied, the dataframe is output to a CSV. I manually import the CSV into Google Sheets, but that step is scriptable in the future if desired.
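The tail end of the script might therefore look something like this (the variable and file names are placeholders):

combined = apply_rules(combined)
combined.to_csv("transactions.csv", index=False)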
I manually peruse and correct the transactions table in Google Sheets.
I set up a roll-up table where I use a series of SUMIFs to aggregate the data for each month. For example, I use the following formula to get all the expenses for a month:
=SUMIFS(Transactions!$D$1:D, Transactions!$F$1:F, "expense", Transactions!A:A, ">="&A64, Transactions!A:A, "<"&EOMONTH(A64, 0)+1)
This formula isn't that useful out of context, but it: (a) sums up the amount column, which is column D, (b) filters to only transactions with the type expense in column F, (c) filters to only transactions dated on or after the date in A64, and (d) filters to only transactions dated before the first day of the following month (EOMONTH(A64, 0)+1).
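For reference, the same monthly expense roll-up can be expressed in pandas against the exported CSV. This is just an illustration of what the formula computes, using the column names assumed earlier, not part of the actual workflow:

import pandas as pd

df = pd.read_csv("transactions.csv", parse_dates=["date"])
expenses = df[df["type"] == "expense"]
monthly = expenses.groupby(expenses["date"].dt.to_period("M"))["amount"].sum()
print(monthly)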