What is bank statement scoring?

For decades, credit bureaus have been the default. A lender pulls a bureau score, checks the number, and makes a call. The system works well enough when consumers have deep credit histories and stable borrowing patterns. But a bureau score is a rear-view mirror: it tells you what happened months ago, filtered through the reporting lag of creditors who may or may not have submitted accurate data. Bank statement scoring reads the windscreen. It tells you what is happening right now, in the consumer's actual bank account, with their actual money.

South African credit providers are adopting it alongside traditional bureau enquiries, and for short-term lending the bank statement signal increasingly carries more weight than the bureau score.

A working definition

Bank statement scoring is the process of extracting, categorising, and scoring transaction data from a consumer's bank statements to produce a credit risk signal. It is not just OCR. It is not just data capture. It is behavioural analysis of cash flow: a structured read of how money moves through a person's account over time.

The input is a PDF bank statement (or a batch of them). The output is a structured decision pack: a behavioural credit score, an affordability calculation, a collection date forecast, and reason codes that explain the recommendation. Three months of statements become a three-dimensional picture of the consumer's financial life: income patterns, spending behaviour, existing obligations, and the timing of it all.

Where a bureau score compresses years of credit history into a single number, bank statement scoring compresses recent transactional behaviour into a decision-ready signal. The two are complementary. But for lenders dealing with thin-file applicants, irregular income, or Regulation 23A compliance, the bank statement signal often carries more weight.

The six-stage pipeline

A bank statement PDF is not data. It is an image of data, formatted for human eyes, not machine consumption. Turning it into a reliable credit signal requires six distinct stages, each adding a layer of intelligence on top of the last.

1. Intake

The statement arrives as a PDF via API or web portal upload. AffyScore accepts up to 30 statements in a single batch, typically three months each for multiple applicants, or six months for a single applicant requiring deeper history. The system identifies the bank, statement period, and account holder from the document metadata and header content before any extraction begins.
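On the client side, the 30-statement batch limit means a larger set of files has to be split before submission. A minimal sketch, assuming statements are referenced by file path; the `split_into_batches` helper and the file names are hypothetical and not part of any AffyScore SDK:

```python
from typing import List, Sequence

MAX_BATCH_SIZE = 30  # batch limit stated in the text above


def split_into_batches(
    paths: Sequence[str], size: int = MAX_BATCH_SIZE
) -> List[List[str]]:
    """Split statement paths into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(paths[i:i + size]) for i in range(0, len(paths), size)]


# Hypothetical file names, for illustration only.
statement_paths = [f"statement_{n:03d}.pdf" for n in range(65)]
batches = split_into_batches(statement_paths)
```

Each inner list can then be submitted as one batch.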

2. Extract

Transaction data is pulled from the PDF using regex-first parsing tuned to each bank's specific statement format. FNB statements look nothing like Capitec statements. Standard Bank formats differ from Discovery. Each bank has its own date formats, column layouts, reference structures, and edge cases. Regex-first extraction is faster, cheaper, and more deterministic than pure AI parsing; when a statement is a scan or a photograph rather than a digital PDF, an AI vision fallback handles the degraded input.

The output of extraction is a structured list of transactions: date, description, amount, running balance, and transaction type. Every row is mapped; nothing is discarded.
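To make regex-first extraction concrete, here is a minimal sketch that parses one line of a made-up statement layout into date, description, amount, and running balance. The layout, the `LINE_PATTERN` expression, and the `parse_line` helper are assumptions for illustration; they do not reproduce any bank's real format or AffyScore's parser.

```python
import re
from datetime import date, datetime
from decimal import Decimal
from typing import NamedTuple, Optional


class Transaction(NamedTuple):
    posted: date
    description: str
    amount: Decimal
    balance: Decimal


# Hypothetical layout: "14 Mar 2025  WOOLWORTHS SANDTON   -247.50   3,512.40"
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{2} [A-Za-z]{3} \d{4})\s+"
    r"(?P<desc>.+?)\s+"
    r"(?P<amount>-?[\d,]+\.\d{2})\s+"
    r"(?P<balance>-?[\d,]+\.\d{2})$"
)


def parse_line(line: str) -> Optional[Transaction]:
    """Return a Transaction, or None if the line does not match the layout."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None  # the caller can route unmatched lines to a fallback
    return Transaction(
        posted=datetime.strptime(match["date"], "%d %b %Y").date(),
        description=match["desc"],
        amount=Decimal(match["amount"].replace(",", "")),
        balance=Decimal(match["balance"].replace(",", "")),
    )
```

Returning `None` for unmatched lines, rather than raising, leaves it to the caller to decide whether a line is a header to skip or a row that needs a fallback parser.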

3. Tamper check

This is where most manual processes fall short. A compliance officer reviewing a bank statement by eye can spot obvious edits (misaligned text, inconsistent fonts) but they cannot check what they cannot see.

AffyScore runs four categories of tamper detection.

A tampered statement does not automatically confirm intent to defraud; the lender needs the signal regardless of intent. The tamper check surfaces anomalies with specific reason codes rather than a binary pass/fail.
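As an illustration of what an anomaly with a reason code can look like, the sketch below checks that each running balance follows from the previous balance plus the transaction amount. This is one generic consistency check; whether it corresponds to any of AffyScore's categories is an assumption, and the `BALANCE_MISMATCH` code name is hypothetical.

```python
from decimal import Decimal
from typing import List, Tuple

# Each row is (amount, running balance printed after the transaction).
Row = Tuple[Decimal, Decimal]


def balance_anomalies(opening: Decimal, rows: List[Row]) -> List[dict]:
    """Flag rows whose printed balance does not follow from the previous one."""
    anomalies = []
    previous = opening
    for index, (amount, balance) in enumerate(rows):
        expected = previous + amount
        if expected != balance:
            anomalies.append({
                "row": index,
                "reason_code": "BALANCE_MISMATCH",  # hypothetical code name
                "expected": expected,
                "found": balance,
            })
        previous = balance  # continue from the printed balance
    return anomalies
```

Continuing from the printed balance, not the expected one, means a single edited row produces a single anomaly instead of flagging every row after it.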

4. Categorise

Raw transaction descriptions are cryptic. "CAPITEC CRD*PNP MENL 0814" is a card purchase at Pick n Pay Menlyn. "ABSA INTERAC DEBS" is a debit order from an unspecified creditor. Categorisation maps every transaction to a taxonomy: income (salary, grant, commission, rental), fixed obligations (bond, vehicle, insurance, loan repayments), variable spending (groceries, fuel, entertainment), and financial events (dishonoured debit orders, returned payments, inter-account transfers).

Accurate categorisation is what separates data capture from credit intelligence. A lender does not need to know the consumer spent R247.50 at Woolworths on 14 March. They need to know that grocery spending averages R4,200 per month and has been trending upward for three consecutive months. Categorisation turns a 90-day transaction list into an answer.
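The step from individual rows to a monthly trend can be sketched in a few lines. The keyword rules and category labels below are hypothetical stand-ins for a much larger taxonomy, and simple substring matching is an assumption made for brevity.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List, Tuple

# Hypothetical keyword rules; a production taxonomy would be far larger.
RULES: List[Tuple[str, str]] = [
    ("SALARY", "income:salary"),
    ("PNP", "variable:groceries"),
    ("WOOLWORTHS", "variable:groceries"),
    ("ENGEN", "variable:fuel"),
]


def categorise(description: str) -> str:
    """Return the first matching category, or 'uncategorised'."""
    upper = description.upper()
    for keyword, category in RULES:
        if keyword in upper:
            return category
    return "uncategorised"


def monthly_totals(
    transactions: List[Tuple[str, str, Decimal]], category: str
) -> Dict[str, Decimal]:
    """Sum absolute amounts per 'YYYY-MM' month for one category.

    Each transaction is (ISO date string, description, signed amount).
    """
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for iso_date, description, amount in transactions:
        if categorise(description) == category:
            totals[iso_date[:7]] += abs(amount)
    return dict(totals)


def is_trending_up(totals: Dict[str, Decimal]) -> bool:
    """True if every month's total exceeds the previous month's."""
    ordered = [totals[month] for month in sorted(totals)]
    return len(ordered) >= 2 and all(a < b for a, b in zip(ordered, ordered[1:]))
```

With three months of grocery rows, `monthly_totals` gives the per-month figures and `is_trending_up` answers the trend question directly.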

5. Affordability

With income identified, obligations mapped, and spending categorised, the system calculates a Regulation 23A-compliant affordability assessment. This is the calculation the NCR requires: gross income, less statutory deductions (tax, UIF), less existing debt obligations, less necessary living expenses, equals disposable income available for new credit.
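The arithmetic described above can be written out directly. This is a sketch of the subtraction chain only; the function name and the figures in the example are illustrative assumptions, not values prescribed by the regulation.

```python
from decimal import Decimal


def disposable_income(
    gross_income: Decimal,
    statutory_deductions: Decimal,
    debt_obligations: Decimal,
    living_expenses: Decimal,
) -> Decimal:
    """Gross income less statutory deductions, debt, and living expenses."""
    return (
        gross_income
        - statutory_deductions
        - debt_obligations
        - living_expenses
    )


# Illustrative monthly figures in rand (assumed, not from a real applicant).
available = disposable_income(
    gross_income=Decimal("28000"),
    statutory_deductions=Decimal("5600"),  # tax and UIF
    debt_obligations=Decimal("6800"),
    living_expenses=Decimal("9000"),
)
```

With these figures, `available` is R6,600 per month.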

The difference between a bank-statement-derived affordability calculation and a declared-income calculation is evidence. When the consumer declares income of R28,000 but the bank statement shows net deposits of R19,400, the lender has an immediate discrepancy to resolve before the credit is granted, not after the first missed payment. When the consumer declares no existing loan obligations but the statement shows three active debit orders totalling R6,800 per month, the lender is looking at a completely different affordability position.
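A declared-versus-observed comparison like the two cases above could be sketched as follows. The 10% tolerance and the function names are hypothetical; a lender would set its own threshold.

```python
from decimal import Decimal


def discrepancy(declared: Decimal, observed: Decimal) -> Decimal:
    """Absolute gap between a declared figure and the statement-derived one."""
    return abs(declared - observed)


def needs_resolution(
    declared: Decimal,
    observed: Decimal,
    tolerance: Decimal = Decimal("0.10"),  # hypothetical threshold
) -> bool:
    """True when the gap exceeds `tolerance` times the larger of the two."""
    return discrepancy(declared, observed) > tolerance * max(declared, observed)


# Figures from the text: declared income vs net deposits, and
# declared obligations vs debit orders found on the statement.
income_gap = discrepancy(Decimal("28000"), Decimal("19400"))
obligation_gap = discrepancy(Decimal("0"), Decimal("6800"))
```

Here `income_gap` is R8,600 and `obligation_gap` is R6,800, and both cases would be flagged under the assumed tolerance.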

The affordability output includes gross income (validated), net income, total identified obligations, estimated necessary expenses, and a disposable income figure with a confidence rating based on data completeness.

6. Score and recommendation

The final stage synthesises everything upstream into a behavioural score on a 300-850 scale, paired with a recommendation (Clear, Caution, or Review) and reason codes that explain the decision drivers.

The score is not a bureau score replacement. It is a different signal entirely, one built on cash flow behaviour rather than credit account history. A consumer with a thin bureau file but three months of stable salary deposits, no dishonoured debit orders, and a healthy balance trajectory will score well. A consumer with an 800 bureau score but a bank statement showing chronic overdraft reliance, gambling transactions reducing disposable income, and four reversed debit orders in the past quarter will score poorly. Both signals have value. Together, they catch what neither catches alone.

What the output looks like

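AffyScore delivers a decision pack, not a raw data dump. The pack contains a behavioural credit score with its recommendation, an affordability calculation, a collection date forecast, and reason codes that explain the recommendation.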

The decision pack is available in three formats: JSON for API integration into lending platforms, PDF for human review and compliance filing, and XLSX for analysts who want to work with the underlying data. All three contain the same information, structured for different consumers within the credit provider's organisation.
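For readers planning an integration, the sketch below shows one way a consuming system might model the JSON form of the pack. The components come from the description above, but every field name, the confidence labels, the reason code strings, and the date representation are assumptions; the actual schema is not given here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Affordability:
    gross_income: float
    net_income: float
    total_obligations: float
    estimated_necessary_expenses: float
    disposable_income: float
    confidence: str  # assumed labels, e.g. "high", "medium", "low"


@dataclass
class DecisionPack:
    score: int  # behavioural score on the 300-850 scale
    recommendation: str  # "Clear", "Caution", or "Review"
    reason_codes: List[str]
    collection_date_forecast: str  # ISO date; assumed representation
    affordability: Affordability

    def __post_init__(self) -> None:
        if not 300 <= self.score <= 850:
            raise ValueError("score must be within 300-850")
        if self.recommendation not in {"Clear", "Caution", "Review"}:
            raise ValueError("unknown recommendation")


# Illustrative values only.
example_pack = DecisionPack(
    score=712,
    recommendation="Clear",
    reason_codes=["STABLE_SALARY", "NO_DISHONOURS"],  # hypothetical codes
    collection_date_forecast="2025-04-25",
    affordability=Affordability(
        gross_income=28000.0,
        net_income=22400.0,
        total_obligations=6800.0,
        estimated_necessary_expenses=9000.0,
        disposable_income=6600.0,
        confidence="high",
    ),
)
payload = json.dumps(asdict(example_pack), indent=2)
```

Validating the score range and recommendation on construction catches malformed packs at the boundary, before they reach decisioning logic.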

Why now

Lenders have been reading bank statements since before credit bureaus existed. Three forces converged to make automated, structured bank statement scoring a practical necessity.

Regulatory pressure

The NCR's proposed amendments to Regulation 23A point to increased scrutiny of affordability documentation. The drafts show which way enforcement is heading: more granular affordability evidence, shorter look-back windows for income verification, and explicit requirements to validate declared information against bank statement data. Credit providers who are still relying on payslips and self-declared expense lists are building on a foundation the regulator is actively eroding.

Bureau lag

Bureau scores reflect credit account data reported by creditors on monthly submission cycles, meaning a new account or delinquency may not appear until the next reporting cycle. A consumer who lost their job last month, who took on three new store accounts in the past six weeks, or who started missing debit orders this week: none of that shows up in the bureau score today. The bank statement shows all of it. For short-term lending decisions (micro-loans, BNPL, device finance), the recency of the bank statement signal matters more than the depth of the bureau history.

Credit-invisible consumers

Millions of South Africans are economically active but credit-invisible: they have bank accounts but no formal credit history. They earn income, pay rent, buy groceries, and manage their money. They have not yet borrowed from a registered credit provider. A bureau score for these consumers is either non-existent or so thin it adds little value. A bank statement score gives them a signal from day one, based on the financial behaviour they are already demonstrating.

Who uses it

The immediate market is NCR-registered credit providers who are already required to perform affordability assessments. That includes roughly 4,500 micro-lenders and approximately 2,200 debt counsellors (NCR Annual Report) who need to verify client financial positions during debt review applications.

Bank statement scoring is also being adopted beyond the regulated core.

Banks supported

AffyScore parses statements from all six major South African banks: FNB, Standard Bank, ABSA, Nedbank, Capitec, and Discovery Bank. Each bank's statement format is handled by a dedicated regex extraction engine tuned to that bank's specific layout, date formatting, and reference number conventions.

Digital PDF statements (downloaded from internet banking) produce the highest extraction accuracy and the most reliable tamper detection. Scanned or photographed statements are processed using an AI vision fallback; the extraction still works, but tamper detection is limited because the original PDF metadata is lost in the scanning process. Where possible, lenders should request digital originals rather than scans.

How it integrates

AffyScore offers two integration paths. The REST API accepts a PDF upload and returns the decision pack via an asynchronous webhook; fire the request, get a callback when processing completes. Typical turnaround is under three seconds in testing for a single statement, longer for batch submissions of up to 30 documents. The API is designed for integration into existing lending platforms, loan origination systems, and automated decisioning workflows.
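On the lender's side, the asynchronous pattern needs an endpoint that receives the callback. The sketch below uses only the Python standard library. The payload shape (a JSON object with a `recommendation` field), the routing labels, and the handler name are assumptions for illustration; the real callback contract should be taken from the API documentation.

```python
import json
from http.server import BaseHTTPRequestHandler


def route_decision(raw_body: bytes) -> str:
    """Parse a callback body and pick a hypothetical lender-side next step."""
    try:
        pack = json.loads(raw_body)
        recommendation = pack["recommendation"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        raise ValueError("malformed decision pack") from exc
    if recommendation == "Clear":
        return "continue_application"
    if recommendation in ("Caution", "Review"):
        return "manual_review"
    raise ValueError(f"unknown recommendation: {recommendation!r}")


class WebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver: acknowledge valid callbacks, reject malformed ones."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            route_decision(body)
            self.send_response(204)  # accepted, no response body
        except ValueError:
            self.send_response(400)  # body was not a usable decision pack
        self.end_headers()
```

To try it locally, `HTTPServer(("", 8080), WebhookHandler).serve_forever()` from `http.server` would serve the handler on port 8080. A production receiver would also authenticate the caller and hand the pack to a queue so the endpoint can respond quickly.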

For credit providers who do not have a development team or who process lower volumes, the web portal provides the same functionality through a browser interface. Upload statements, receive decision packs, download reports. No integration required.

Both paths deliver the same output in JSON, PDF, and XLSX formats. The API adds webhook notifications and batch processing; the portal adds a visual dashboard for manual review.

What it costs

AffyScore pricing runs from R8 to R35 per extraction, depending on monthly volume. At the top tier, the cost of a bank statement extraction, affordability calculation, and behavioural score combined is comparable to a single bureau enquiry, except the output answers three questions instead of one.

There are no setup fees, no monthly minimums, and no long-term contracts. Credit providers pay per extraction and scale as their volume grows. The pricing model is designed to make bank statement scoring economically viable even for micro-lenders processing 50 applications per month.
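As a rough worked example of the per-extraction model, the sketch below computes a monthly bill for a micro-lender processing 50 applications. It assumes one extraction per application and that the lowest-volume tier is priced at the upper end of the stated range (R35); the tier boundaries themselves are not given here.

```python
from decimal import Decimal


def monthly_cost(extractions: int, price_per_extraction: Decimal) -> Decimal:
    """Total monthly spend in rand under pure per-extraction pricing."""
    return extractions * price_per_extraction


# Assumption: 50 extractions per month billed at R35 each.
low_volume_bill = monthly_cost(50, Decimal("35"))
```

Under those assumptions the monthly bill comes to R1,750, with nothing else to add since there are no setup fees or minimums.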

For a detailed breakdown of volume tiers and per-unit pricing, see the pricing section on our homepage.

Frequently asked questions

What is bank statement scoring?

The process of extracting, categorising, and scoring transaction data from a consumer's bank statements to produce a credit risk signal. It turns a PDF into a structured decision pack: behavioural score, affordability calculation, collection date forecast, and reason codes.

How is it different from OCR or data capture?

OCR reads text from an image. Data capture extracts numbers into a spreadsheet. Bank statement scoring goes further: it categorises every transaction, checks for tampering, calculates affordability, and produces a behavioural credit score with recommendation and reason codes.

Which SA banks are supported?

AffyScore parses statements from all six major South African banks: FNB, Standard Bank, ABSA, Nedbank, Capitec, and Discovery Bank. Each has a dedicated extraction engine tuned to its specific format.

Does it replace the bureau?

No. Bank statement scoring supplements bureau data. The bureau reflects long-term credit history; the bank statement reflects current cash conduct. The strongest credit decision uses both.

How long does processing take?

Under three seconds for a single digital PDF statement in test conditions. Production turnaround depends on network and load. Batch submissions of multiple statements take longer.

Is it structured for Regulation 23A?

Reg 23A requires credit providers to verify income from bank statements or payslips and to document the affordability calculation. AffyScore's extraction and affordability outputs are structured to satisfy that evidence requirement, but compliance with the NCA remains the credit provider's obligation.

This article is general information for credit providers and does not constitute professional legal or financial advice. Specific regulatory requirements may vary. Always verify against current NCA legislation and NCR guidelines before acting.
