Plain Text Accounting & Financial Reporting
Over the past year, I’ve been acting as the treasurer for a small publishing company. I converted our books from a collection of spreadsheets to a plain text accounting ledger and wrote programs to automate balance, income, and sales reporting using a tool called Beancount. These recipes may be useful to others running businesses, particularly if they have some programming talent.
Plain Text Accounting
Plain Text Accounting is an approach to accounting that has helped us minimize costs, maximize transparency, and automate reporting. Practitioners record transactions within “plain text” files that can be easily shared and stored in version control systems (such as git). Since the format of the transactions is well-defined, tools can process the files to provide auditing, translation, and reporting. For example, a program can translate a credit card statement into a plain text format to facilitate importing.
For our tooling, I chose Beancount v3, partially due to its Python API, but competitor programs ledger or hledger may work better for you. Since you control the data, there is very little “lock-in” between any of the tools although they define the ledger format differently.
Situational Context
Our business sells products through various online storefronts. We are a purely internet-based business. Individual products are developed by ad-hoc product teams who earn royalties based on their individual contributions and team agreements, so the royalty structures are unique per product.
Since we are bootstrapping the business and building revenue takes time, we try to minimize overhead cash expenditures. When we reviewed commercial accounting/bookkeeping tools, we found that they were either priced outside our budget or did not support tracking cost of goods sold and royalties sufficient to our needs. Spreadsheets were sufficient early on, but they were error-prone and, as the number of products grew, were becoming increasingly complex and fragile.
Double Entry Accounting and the Ledger
Double entry accounting is a practice where transactions are recorded (at least twice): one or more source accounts and one or more destination accounts. Since the sum of the sources and destinations must equal zero, the practice has a built-in way to verify its correctness.
All transactions can be recorded in a single ledger. For us, we have a data/ledger.txt file within our financial git repository. Access to the repository is controlled by our software forge and all changes automatically have an audit trail.
The ledger has five accounts at the top level: Assets, Liabilities, Income, Expenses, and Equity. Each ledger will define an account hierarchy under these five accounts. Using the beancount syntax, our (anonymized) accounts are:
2025-01-01 open Assets:BankName:Checking USD
2025-01-01 open Assets:StoreFront USD
2025-01-01 open Liabilities:Royalties:Member1 USD
2025-01-01 open Liabilities:Royalties:Member2 USD
2025-01-01 open Income:BankName:Interest USD
2025-01-01 open Expenses:StoreFront:Overhead USD
2025-01-01 open Expenses:Govt:TaxFees USD
2025-01-01 open Expenses:Operations USD
2025-01-01 open Expenses:UnpaidRoyalties USD ; for balancing out royalty liabilities
2025-01-01 open Equity:Member1 USD
2025-01-01 open Equity:Member2 USD
; Do not allow fake accounts to be used after initialization
2025-01-01 open Equity:Opening-Balances
2025-01-02 close Equity:Opening-Balances
; Project specific accounts
2025-01-01 open Income:StoreFront:Prod01 USD
2025-01-01 open Income:StoreFront:Prod02 USD
2025-01-01 open Expenses:Prod02 USD
In Beancount, all entries in the ledger are marked with a date. Accounts must be opened on a certain date and can be closed. If an entry is marked against a closed account, it will be reported as an error. We use the Equity:Opening-Balances account to handle initialization of values (since the business was already on-going by the time we started the ledger) and close it the day after to prevent using it accidentally.
We have two assets, a checking account at a bank and the account at our storefront since transfers from the storefront are not automatic. Unpaid royalties are liabilities and we group each member under the general Royalties category. Income comes from product sales, which we group under the store, and interest from the bank. There are a variety of sources of expenses, some general, some product specific, and we record equity per member.
As you will see later, we later added an attribute to accounts to differentiate between operational (core business related) and non-operational income and expenses. Instead of adding the attribute, we could have used a hierarchical element (e.g. Income:Op:StoreFront:Prod01 and Income:NonOp:BankName:Interest) which would have aligned better with Beancount’s query functionality.
A sales transaction demonstrates the double-entry method. A customer purchases Prod02 and pays $3.
2025-12-23 txn "StoreFront" "Prod02 Sale"
Income:StoreFront:Prod02 -3.00 USD ; Total (Gross) Sale
Expenses:StoreFront:Overhead 0.90 USD ; Storefront cut
Assets:StoreFront 2.10 USD ; Earnings
Liabilities:Royalties:Member1 -0.55 USD ; Royalty
Liabilities:Royalties:Member2 -0.05 USD ; Royalty
Expenses:UnpaidRoyalties 0.60 USD ; Royalty Float
The sum of these six changes to accounts is zero. The signs used on the values may initially seem counter-intuitive but they do make sense. The gross sales value, the Income for the sale, is recorded as a negative number. The storefront takes a portion of the sale, in this case $0.90, so that is recorded as an Expense (or cost of goods sold). (In this example, sales tax and payment processor expenses are part of the overhead.) That leaves $2.10 to be stored in our “account” with the storefront. Due to royalties, we cannot claim full ownership of that $2.10 so we need to account for our liabilities. We record two Royalty liabilities and balance their sum with an Expenses line for tracking overall UnpaidRoyalties.
When we pay out royalties, we debit the asset where we are drawing money from and credit our liabilities:
2025-01-12 txn "Company" "Member1 Royalty Payment 2024Q4"
Assets:BankName:Checking -17.00 USD ; Payment
Liabilities:Royalties:Member1 17.00 USD ; Reduction of liability
Notably, we do not change the Expenses:UnpaidRoyalties value. Assets and Liabilities have a value at given time (similar to a Gauge in telemetry), while Income and Expenses are deltas between two time values (similar to a Counter in telemetry). It makes sense to discuss a bank balance at the start or end of a quarter, but a balance is not defined over a three month period. In contrast, we can discuss sales over a quarter, but not sales at a specific point of time.
Beancount includes a tool bean-check which can verify the balancing of accounts. We call bean-check as part of our git merge pipeline. In our ledger, we include balance checks to verify our accounts match those of the banks and storefronts, e.g.:
2025-08-01 balance Assets:BankName:Checking 1670.06 USD
For populating the ledger, we have Python scripts that translate exported monthly CSV reports into the beancount format. It is also easy to write entries by hand, which is necessary because human judgment is often needed to properly categorize expenses.
Reporting using Beancount
The members of the company board expect monthly balance statements, income statements by the month, quarter, and year, and reports of sales by product. So far, I haven’t been asked to produce any custom reports, but I expect that will come.
To generate the reports, I use Python, the jinja2 templating library, and beancount’s beanquery library. On a merge to the main branch, we update a website (via Gitlab Pages) with the latest set of reports. We use Gitlab’s authentication system to control access to the website.
beanquery provides a SQL-like interface to the ledger that implements the Python database interface. Creating a connection is straight-forward:
conn: beanquery.Connection = beanquery.connect('beancount:data/ledger.txt')
I pass around a Context instance to store the various run-time objects and configuration:
@dataclass
class RenderContext:
conn: beanquery.Connection
env: jinja2.Environment
dst: pathlib.Path
earliest_ledger: datetime.date # inclusive transaction date
latest_ledger: datetime.date # inclusive transaction date
The earliest and latest ledger date values are retrieved via beanquery:
earliest = fetch_one_row(conn, 'SELECT date ORDER BY date ASC LIMIT 1')[0]
latest = fetch_one_row(conn, 'SELECT date ORDER BY date DESC LIMIT 1')[0]
Beanquery returns rich Python objects from queries, which can be unexpected, but also means you will write fewer type conversion functions.
Balance Sheets
A balance sheet reports assets, liabilities, and equity on the close of a financial period. In our case, we report monthly. The function render_balance_sheets generates a monthly report for every month in our ledger with recorded transactions. Beanquery does not support parameterized queries, so we use string interpolation via .format instead.
def render_balance_sheets(ctx: RenderContext):
template = ctx.env.get_template('balance.html')
for start_inc, end_exc in month_pair_iter(ctx.earliest_ledger, ctx.latest_ledger):
query = "SELECT account, SUM(position) FROM CLOSE ON {0} CLEAR " \
"WHERE account~'Assets|Liabilities|Equity' ORDER BY account".format(end_exc.isoformat())
out = _render_positions(ctx, template, query, (start_inc, end_exc))
ctx.dst.joinpath(f'balance-{start_inc.year}-{str(start_inc.month).rjust(2, "0")}.html') \
.write_text(out, encoding='utf-8')
The function _render_positions renders the Jinja2 template with the results from the query. We also translate the raw query results into an AccountPosition which breaks out the top-level account type as the field account_type and flatten’s beancount’s complex Inventory type into a number. (Although we are explicit with our currencies in the ledger, we only handle USD.)
AccountPosition is a dataclass defined as:
@dataclass
class AccountPosition:
account_type: str # Assets, Liabilities, Equity, Income, Expense
account: str # e.g. StoreFront, Royalties:Member1
usd: Decimal
# @property is_operational described under Income Sheets
@staticmethod
def from_tuple(record: Tuple[str, Inventory]) -> 'AccountPosition':
"""
From a bean-query balance record, return an AccountPosition
Example:
> AccountPosition.from_tuple(("'Assets:BankName:Checking'", "(1055.00 USD)"))
AccountPosition('Assets', 'BankNameChecking', Decimal('1055.00'))
:param record:
:return:
"""
account_type, account = record[0].split(':', maxsplit=1)
inventory: Inventory = record[1]
amt: Amount = inventory.get_currency_units('USD')
return AccountPosition(account_type, account, amt.number)
We excerpt the liabilities table from balance.html below. In the report, this would preceded by the assets table and followed by the equity table. For ease of reading, we exclude from the table accounts with a zero balance and represent liabilities as positive values. In the table footer, Jinja2 pipes are a good way to compute filtered sums. Since numbers are stored as Decimals, we do not need to worry about floating point errors.
<table id="liabilities" class="table-4in">
<thead>
<tr>
<th class="align-left">Liability</th>
<th class="align-right">Value ($)</th>
</tr>
</thead>
<tbody>
{% for balance in balances if balance.account_type == 'Liabilities'
and not balance.usd.is_zero() %}
<tr>
<td>{{ balance.account }}</td>
<td class="align-right currency">{{ balance.usd * -1|round(2) }}</td>
</tr>
{% endfor %}
</tbody>
<tfoot>
<tr>
<td><em>Total</em></dt>
<dt class="align-right currency">{{ balances
|selectattr('account_type', 'eq', 'Liabilities')
|sum(attribute='usd') * -1|round(2) }}</dt>
</tr>
</tfoot>
</table>
Income Sheets
An income sheet reports income and expenses within a period of time. The function render_income_sheets creates files for each month and quarter in our ledger’s horizon. Since the data represents flows rather than instances in time, the queries include both an OPEN ON and a CLOSE ON.
def render_income_sheets(ctx: RenderContext):
template = ctx.env.get_template('income.html')
for start_inc, end_exc in month_pair_iter(ctx.earliest_ledger, ctx.latest_ledger):
query = "SELECT account, SUM(position) FROM OPEN ON {0} CLOSE ON {1} " \
"WHERE account~'Income|Expenses' GROUP BY 1 ORDER BY 1" \
.format(start_inc.isoformat(), end_exc.isoformat())
out = _render_positions(ctx, template, query, (start_inc, end_exc))
ctx.dst.joinpath(f'income-{start_inc.year}-{str(start_inc.month).rjust(2, "0")}.html') \
.write_text(out, encoding='utf-8')
for quarter_inc, quarter_exc in quarter_iter(ctx.earliest_ledger, ctx.latest_ledger):
query = "SELECT account, SUM(position) FROM OPEN ON {0} CLOSE ON {1}" \
"WHERE account~'Income|Expenses' GROUP BY 1 ORDER BY 1" \
.format(quarter_inc.isoformat(), quarter_exc.isoformat())
out = _render_positions(ctx, template, query, (quarter_inc, quarter_exc))
ctx.dst.joinpath(f'income-{quarter_inc.year}-Q{str((quarter_inc.month - 1) // 3 + 1)}.html') \
.write_text(out, encoding='utf-8')
The operational income portion of the income sheet is generated via:
{% for balance in balances if balance.account_type == 'Income' and balance.is_operational %}
<tr>
<dt class="align-left indent-1">{{ balance.account }}</dt>
<dt class="align-right currency">{{ balance.usd|abs|round(2) }}</dt>
</tr>
{% endfor %}
<tr>
<dt class="align-left indent-1"><em>Total</em></dt>
<dt class="align-right currency">{{ balances
|selectattr('account_type', 'eq', 'Income')
|selectattr('is_operational')
|sum('usd')|round(2)|abs }}</dt>
</tr>
The is_operational method is coded as:
@property
def is_operational(self):
"""
:return: true if Income/Expense is part of normal business operations
"""
if self.account_type == 'Income':
return self.account != 'BankName:Checking'
elif self.account_type == 'Expenses':
return self.account != 'Govt:TaxFees'
else:
return False
Jinja2 templates are not allowed to call methods on objects, so we use @property so the template can retrieve the boolean value.
From a maintability perspective, this implementation is fragile if we define new accounts. Fortunately, we rarely define new accounts. Representing the operational nature of the account in the hierarchy would solve the problem.)
Sales by Product
For sales, we provide a table with rows for each product, columns for months within a year, and totals for each row and month.
The table is generated by the template:
<table id="gross-sales" class="table-dense">
<thead>
<tr>
<th class="align-left">Product</th>
<th>01</th>
<th>02</th>
<th>03</th>
<th>04</th>
<th>05</th>
<th>06</th>
<th>07</th>
<th>08</th>
<th>09</th>
<th>10</th>
<th>11</th>
<th>12</th>
<th>Total</th>
</tr>
</thead>
<tbody>
{% for sale in sales %}
<tr>
<dt class="align-left">{{ sale.positions[0].account }}</dt>
{% for position in sale.positions %}
{% if position.usd.is_zero() %}
<dt class="align-right"></dt>
{% else %}
<dt class="align-right">{{ position.usd|abs|round(0) }}</dt>
{% endif %}
{% end for %}
<dt class="align-right">{{ sale.positions|sum('usd')|round(0)|abs }}</dt>
</tr>
{% end for %}
</tbody>
<tfoot>
<tr>
<dt><em>Total</em></dt>
{% for month in range(1, 13) %}
<dt class="align-right">{{ (monthly_sums[month]) }}</dt>
{% end for %}
<dt class="align-right">{{ monthly_sums.values()|sum }}</dt>
</tr>
</tfoot>
</table>
We leverage beancount’s PIVOT capability:
def render_gross_sales(ctx: RenderContext):
"""
Gross sales by year and month for operational income accounts
:param ctx:
:return:
"""
template = ctx.env.get_template('sales.html')
for year in range(ctx.earliest_ledger.year, ctx.latest_ledger.year + 1):
query = "SELECT account, month(date) as d, SUM(position)" \
"WHERE account~'Income' and year(date)={0} GROUP BY 1, 2 ORDER BY 1 PIVOT BY 1, 2" \
.format(year)
now = datetime.date.today().isoformat()
incomes = fetch_all_rows(ctx.conn, query)
sales = [MonthAccountPositionPivot.from_tuple(i) for i in incomes]
sales = list(filter(lambda s: s.positions[0].is_operational, sales))
# jinja2 is ignoring a round(0) filter, so we pre-round the monthly sums before the template
monthly_sums = defaultdict(Decimal)
for sale in sales:
for i, position in enumerate(sale.positions):
monthly_sums[i+1] = monthly_sums[i+1] + position.usd
for month in range(1, 13):
monthly_sums[month] = round(abs(monthly_sums[month]), 0)
out = template.render(now=now, year=year, sales=sales, monthly_sums=monthly_sums)
ctx.dst.joinpath(f'sales-{year}.html').write_text(out, encoding='utf-8')
MonthAccountPositionPivot generates a rectangular grid of values from the query results like so:
@dataclass
class MonthAccountPositionPivot:
positions: Sequence[AccountPosition]
@staticmethod
def from_tuple(record: Tuple[str, ...]) -> 'MonthAccountPositionPivot':
positions = []
acct = record[0]
for m in range(1, len(record)):
inv = record[m]
if inv is None:
inv = ZERO_INVENTORY
positions.append(AccountPosition.from_tuple((acct, inv)))
for zero_months in range(13 - len(record)):
positions.append(AccountPosition.from_tuple((acct, ZERO_INVENTORY)))
assert len(positions) == 12, f"len(positions) must be 12, is {len(positions)}"
return MonthAccountPositionPivot(positions)
Conclusion
I’ve been pleased with adopting plain text accounting and beancount. I have more confidence that the data is correct since the ledger is verified against a history of monthly balances. Maintaining royalties is also easier because the logic is stored in one place and it is transparent how they are calculated for each transaction.