Sunday, May 10, 2026

Reading the CURRENT — LevelDB Forensics with crush

Moin! 👋

LevelDB is everywhere — you just don't see it. If your workflow starts and ends with SQLite, you're walking past a surprising amount of data on every acquisition. Chrome, Electron-based desktop apps, iOS, Android — LevelDB shows up constantly in real cases, often unnoticed, and often holding data that other tools quietly skip over.

With the current release, the crush LevelDB viewer has grown significantly. This post walks through what it can do — and more importantly, what it means for your investigation.

What Is LevelDB — and Why Should I Care?

LevelDB is an on-disk key-value store developed by Google. Unlike SQLite, it does not present itself as a single file with a recognisable header — instead, a LevelDB database occupies an entire folder, containing a mix of .log and .ldb files alongside metadata files named MANIFEST-######, CURRENT, and LOG. If you have ever opened an acquisition, seen a folder full of files like that, and moved on, there is a good chance you left data on the table.

The forensic value of LevelDB depends on what the application chose to store there — and that varies enormously. In Chrome and Chromium-based browsers you will find browsing history fragments, sync metadata, IndexedDB entries from web applications, session state, and local storage data. Electron apps like Signal Desktop, Discord, or WhatsApp Desktop use LevelDB for app configuration, cached user data, and UI state.

On top of that, there is a structural property worth understanding: deleted records are not immediately gone. Every write to LevelDB — including deletions — is appended as a new entry with a sequence number. A deletion creates a "tombstone" entry: the key remains, but the value is empty. In .ldb files, superseded or deleted records are eventually cleaned out during compaction — but until that happens, they can still be read. In .log files, the active write log, everything is there. A tombstone without a value is not nothing: the key alone can tell you that something existed and was deliberately removed.
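To make the tombstone mechanics concrete, here is a minimal Python sketch (function names are mine, not crush's) that parses the write-batch payload carried inside a .log record, per the documented LevelDB format: an 8-byte base sequence number, a 4-byte operation count, then per operation a type byte — 0x01 for a put with a value, 0x00 for a tombstone that carries a key but no value:

```python
import struct

def read_varint(buf, pos):
    """Decode a LevelDB varint starting at pos; return (value, new_pos)."""
    result, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7

def parse_write_batch(payload):
    """Parse a write-batch payload: 8-byte base sequence (LE), 4-byte op
    count (LE), then per op: 1-byte type (1 = put, 0 = tombstone),
    varint-length key, and for puts a varint-length value."""
    seq, count = struct.unpack_from("<QI", payload, 0)
    pos, ops = 12, []
    for i in range(count):
        op_type = payload[pos]; pos += 1
        klen, pos = read_varint(payload, pos)
        key = payload[pos:pos + klen]; pos += klen
        value = b""
        if op_type == 1:  # a put carries a value; a tombstone (0) does not
            vlen, pos = read_varint(payload, pos)
            value = payload[pos:pos + vlen]; pos += vlen
        ops.append((seq + i, "put" if op_type == 1 else "tombstone", key, value))
    return ops
```

A batch that writes a key and then deletes it yields two entries with consecutive sequence numbers — the second a tombstone with an empty value, which is exactly the "something existed and was removed" signal described above.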

Alex Caithness from CCL wrote the definitive primer on this in 2020 — if you have not read it yet, it is required reading before going further:

The open-source Python library that came out of that research — ccl_chrome_indexeddb — is also the foundation that the crush LevelDB viewer builds on. Full credit to Alex and the CCL team for that work.

Where to Find LevelDB

LevelDB shows up more often than most people expect — and the pattern is consistent enough that you can learn to spot it without memorising paths.

On iOS and Android, look for any Chromium-based browser (Chrome, Edge, Brave) and any app built on a Chromium WebView or Electron runtime. The LevelDB stores typically live under the app's data container, in subdirectories named IndexedDB, Local Storage/leveldb, or Sync Data/LevelDB. The folder name is usually your first hint.

On desktop systems — Windows, Linux, macOS — the same logic applies, but the surface area is larger than most people assume. Every Chromium-based browser and every Electron app on the system has its own LevelDB stores. On Linux alone, a standard workstation running Chrome, Signal Desktop, Discord, and Slack will have dozens of LevelDB databases under ~/.config/. It is easy to treat LevelDB as a mobile-only concern. It is not.

The reliable indicator is always the folder structure: a directory containing a CURRENT file, one or more MANIFEST-###### files, and a mix of .log and .ldb or .sst files is a LevelDB database — regardless of what the folder itself is named.
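That structural check is easy to script for triage across a large acquisition. A rough Python sketch of the same heuristic — illustrative only, not crush's detection code:

```python
import os
import re

def looks_like_leveldb(path):
    """Heuristic: a directory is a LevelDB database if it contains a
    CURRENT file, at least one MANIFEST-######, and .log/.ldb/.sst
    data files — regardless of what the folder itself is named."""
    names = os.listdir(path)
    has_current = "CURRENT" in names
    has_manifest = any(re.fullmatch(r"MANIFEST-\d+", n) for n in names)
    has_data = any(n.endswith((".log", ".ldb", ".sst")) for n in names)
    return has_current and has_manifest and has_data
```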

Example LevelDB folder content


For this post, I am using the Spotify Client App from an iPhone acquisition (Public iOS 17 Image from Josh Hickman) as the example throughout. The path to the LevelDB looks like this:

/private/var/mobile/Containers/Data/Application/E81161FB-2689-4DF6-B98B-C8F95B056440/Library/Application Support/PersistentCache/Users/7jlf2hrgerxu6n0pp8l35r5h0-user/primary.ldb/

Opening a LevelDB Database in crush

LevelDB databases don't announce themselves — there is no single file to click on. A LevelDB database is a plain directory, typically named LevelDB (or sometimes leveldb), containing a mix of .log, .ldb, .sst, and metadata files. There is no special label or icon that distinguishes it from any other folder in the file panel. The .sst (Sorted String Table) extension is functionally identical to .ldb — it is an older naming convention still used by some non-Chrome LevelDB implementations. crush handles both.

To open it in crush, right-click the folder in the file panel and select Open. A single left-click only selects the folder — the right-click menu is what triggers crush to read the directory structure and open the LevelDB viewer. That is the step that is easy to miss the first time.

No extraction needed — crush reads directly from inside ZIP or TAR archives, or from a folder on disk. 

File panel showing the right-click context menu on a LevelDB folder


The Overview Tab

The first tab is the Overview. Before looking at any records, this is where I orient myself.

It shows the MANIFEST metadata for the database: the comparator in use, the last sequence number, the log number, and the files that make up the database grouped by compaction level. crush parses all MANIFEST-* files present in the directory — not just the current one. The active MANIFEST is labelled (current); older ones expose compaction history from before the last recovery, potentially referencing file numbers that are no longer on disk. The CURRENT pointer itself is shown as a separate entry, and where present, the prev_log_number field identifies the WAL log file that preceded a recovery — useful context when you are trying to reconstruct the write history of a database that has been through a crash or a clean shutdown cycle.
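The CURRENT pointer itself is trivially simple — a one-line text file naming the active MANIFEST. A small sketch of how the pointer and the manifest inventory can be read (helper names are mine):

```python
import os
import re

def active_manifest(db_dir):
    """CURRENT is a one-line text file naming the active MANIFEST,
    e.g. "MANIFEST-000042" followed by a newline."""
    with open(os.path.join(db_dir, "CURRENT"), "rb") as f:
        return f.read().decode("ascii").strip()

def all_manifests(db_dir):
    """List every MANIFEST-###### in the folder — older ones can still
    expose compaction history from before the last recovery."""
    return sorted(n for n in os.listdir(db_dir)
                  if re.fullmatch(r"MANIFEST-\d+", n))
```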

The last sequence number tells you how many write operations have been performed on this database across its lifetime — including updates and deletions, not just inserts. A low number (like 0 or close to it) means the database is fresh or barely touched. A high number is a signal that there has been a lot of activity — and potentially a lot of deleted data worth looking at.

The compaction level of the files tells you how mature the database is. Fresh data lives in .log files and level-0 .ldb files; as the database ages and compaction runs, data moves to higher levels and old or deleted records get cleaned out. A database where everything is still at level 0 has had very little compaction — which is good for us, because deleted records are more likely to still be present.


Overview tab showing MANIFEST metadata, all MANIFEST files, and file list by level


The Files Tab

The Files tab gives you a per-file breakdown: filename, type (Log or Ldb), compaction level, size in bytes, and counts of live, deleted, and unknown records per file. For .ldb and .sst table files, the tab also shows the Smallest Key and Largest Key boundaries sourced directly from the MANIFEST VersionEdit entries — decoded as UTF-8 where possible, hex otherwise. At a glance, you can see which files cover which key ranges without opening a single record.

Files containing deleted records are colour-coded red. That is your first triage signal — if a file is red, it is worth a closer look. A file with a high deleted record count relative to its live records tells its own story.

The distinction between Live, Deleted, and Unknown is worth understanding:

  • Live — the record is the current version of a key and has a value.
  • Deleted — a tombstone entry. The key is present, the value is empty. Something was here.
  • Unknown — records in .ldb files where the state cannot be determined without reading the full compaction context. These are worth inspecting manually.
Files tab showing per-file record counts, size, key ranges, with red-coded files containing deleted records

The Records Tab — Live and Deleted in Context

The Records tab shows all records from all files in a single table. This is where the examination happens.

The filter toolbar lets you narrow down to All / Live / Deleted / Unknown records. Deleted records appear inline in red, directly alongside the live data — not in a separate view. That matters: seeing what is gone in the context of what is still there is often where the interesting questions start.

Each row shows:

  • The sequence number — the order in which this record was written. Lower sequence numbers are older writes.
  • The offset — the byte position within the source file where this record begins, displayed in hex. This is your anchor for cross-referencing in a hex editor and for court testimony: if you need to point to exactly where in the evidence file a record lives, this is how you do it. The offset is included in CSV exports.
  • A UTF-8 text preview and a hex preview of the key.
  • A UTF-8 text preview and a hex preview of the value.
  • The state — Live, Deleted, or Unknown.

Selecting any row feeds the full key and value bytes into a split hex pane below the table — separate Key and Value tabs, so you can inspect each independently without them running together. For records from .ldb or .sst table files, a third Internal Key tab is also available, showing the complete LevelDB internal key: the user key plus its 8-byte sequence number and type suffix. This is the raw on-disk representation — useful when you need to verify exactly what LevelDB stored, or when the type byte tells you something about how the record was written.
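The internal key layout is simple enough to decode by hand. Per the standard LevelDB encoding, the trailing 8 bytes are a little-endian (sequence << 8) | type value — a small sketch (helper name is mine):

```python
import struct

def split_internal_key(ikey):
    """Split a LevelDB internal key into (user_key, sequence, type).
    The last 8 bytes pack the sequence number and the type byte:
    type 0x01 marks a live value, 0x00 a deletion."""
    user_key, trailer = ikey[:-8], ikey[-8:]
    packed = struct.unpack("<Q", trailer)[0]
    return user_key, packed >> 8, packed & 0xFF
```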

Records tab showing mixed live and deleted records, with the split hex pane below


Search and Filter

The Search box filters rows case-insensitively across all columns — key text, value text, state. It combines with the state filter buttons, so you can ask questions like "show me only deleted records whose key contains this prefix" in two clicks. For databases with thousands of records, this is how you find signal in the noise quickly.

The LOG Files

LevelDB writes its own operational log to files named LOG and LOG.old — plain text, not to be confused with the binary .log write-ahead log files. crush surfaces these in dedicated tabs, one per file that exists, with the full content displayed in a monospace view and a Find toolbar for searching within the log. This is where LevelDB records its own compaction events, file opens, and error conditions — useful background when you are trying to understand what happened to a database and when.

BLOB Inspector — When the Value Is Protobuf

In the LevelDB used as the example here, roughly 95% of the values are Protobuf-encoded binary data. That is typical for LevelDB stores — apps use Protobuf extensively to serialise entity data before it goes into LevelDB.

Right-clicking any row in the Records table offers Inspect Key…, Inspect Value…, and for table-file records, Inspect Internal Key… — all opening the shared BLOB Inspector dialog.

In Auto mode, the inspector tries to detect the format automatically. For valid Protobuf data it will decode it and display the wire-decoded fields in protoc --decode_raw style — field numbers, wire types, and values, rendered as formatted text. PNG, JPEG, GIF, binary plists, XML, Android Binary XML, and JSON are also detected automatically in Auto mode via magic bytes or content sniffing.
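Schema-less decoding of this kind needs nothing more than the protobuf wire format. A minimal illustrative decoder — not crush's implementation — that walks tags and wire types the way protoc --decode_raw does, leaving nested messages as raw bytes:

```python
import struct

def _varint(buf, pos):
    """Decode a protobuf varint starting at pos; return (value, new_pos)."""
    result, shift = 0, 0
    while True:
        b = buf[pos]; pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7

def decode_raw(buf):
    """Yield (field_number, wire_type, value) triples from protobuf
    wire-format data without a schema. Wire types: 0 = varint,
    1 = fixed 64-bit, 2 = length-delimited, 5 = fixed 32-bit."""
    pos, fields = 0, []
    while pos < len(buf):
        tag, pos = _varint(buf, pos)
        field, wire = tag >> 3, tag & 0x7
        if wire == 0:
            value, pos = _varint(buf, pos)
        elif wire == 1:
            value = struct.unpack_from("<Q", buf, pos)[0]; pos += 8
        elif wire == 2:  # string, bytes, or an embedded message
            length, pos = _varint(buf, pos)
            value = buf[pos:pos + length]; pos += length
        elif wire == 5:
            value = struct.unpack_from("<I", buf, pos)[0]; pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire}")
        fields.append((field, wire, value))
    return fields
```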


BLOB Inspector open on a Protobuf value, showing decoded fields


If auto-detection does not find a match, you can manually select from the available decode modes: Hex, UTF-8, Latin-1, Base64, Plist, XML, JSON, Protobuf (schema-less), Android Binary XML, or Image. This covers the formats you are most likely to encounter in LevelDB values across Chrome, Electron apps, and Android databases.

The inspector opens as a non-modal window — the rest of the UI stays fully interactive while it is open, and you can have multiple inspectors open at the same time. That matters when you are comparing values across records without constantly opening and closing the dialog.

The BLOB Inspector is shared across all viewers in crush — SQLite, LevelDB, Realm. Any improvement to the inspector (a new decode mode, a UI fix) benefits all three at once. If you have been following the SQLite and Realm deep dives, the workflow here is identical: right-click, inspect, decode.

Further Reading

Happy examining. 🐢

Tuesday, May 5, 2026

Object by Object — RealmDB Forensics with crush

Moin! 👋

After the SQLite deep dive with crush, the next stop is Realm. I am using the Android app de.formel1 — the German Formula 1 news app — as the example throughout. I enjoy watching F1, and it turns out the app stores some of its data in its Realm database.

With v0.7.0, the crush Realm viewer has been significantly extended and improved. This post walks through what it can do — and as before, what the features actually mean for an investigation.

What is Realm — and Why Should I Care?

Realm is an object-oriented database developed for mobile platforms. Unlike SQLite, which stores data in tables with rows and columns, Realm stores data as objects — think classes with properties rather than spreadsheet rows. It was originally developed by Realm Inc., acquired by MongoDB in 2019, and has seen adoption in a range of Android and iOS apps.

Forensically, Realm matters because it shows up in real cases — and most tools either support it poorly or not at all. If you have never encountered a .realm file in an acquisition, you may simply not have noticed: the file has no extension-based prominence and the format is far less documented than SQLite.

For a thorough introduction to Realm and its physical structure, Damien Attoe's The Realm Files series is the best starting point available:

Finding the Database

Realm databases use the .realm extension and are typically found under the app's data directory. For the Formula 1 news app it is:

/data/data/de.formel1/files/

The extension alone is not always reliable — some apps store Realm files without it or under an unexpected name. A more robust identifier is the magic bytes: at offset 16 in the file, Realm stores the 4-byte value 54 2D 44 42 — the ASCII string "T-DB". If you see that signature, you are looking at a Realm database regardless of the filename.
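That signature check is a one-liner to script — a sketch using the offset and magic bytes from the text above:

```python
def is_realm_file(path):
    """Check the Realm signature: the 4 bytes at file offset 16
    must read 54 2D 44 42 — ASCII "T-DB"."""
    with open(path, "rb") as f:
        f.seek(16)
        return f.read(4) == b"T-DB"
```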

In crush, .realm files are now labelled Realm in the VFS tree panel — the same way SQLite files show their type at a glance. Previously they showed no label at all, which made them easy to overlook when browsing a large acquisition.

The Header

The Header tab is my first stop when opening a Realm file — before looking at any data. It shows the fundamental structural information of the database: the file format version, the positions of both top-level references in the file, and which one is currently active.

This matters because everything else in the viewer builds on this foundation. The active Top Ref determines which snapshot is the current state of the database; the inactive one points to the previous commit. Knowing where both live in the file is the starting point for understanding what the database currently contains — and what it contained before.


The Schema Tab

The Schema tab shows every table (class in Realm terminology) as an expandable node listing all column names with their Realm type: int, bool, string, date, link, and so on.

This is important context before looking at data. Which tables exist? What properties do they have? Are there link columns that reference other tables — and can I join across them? Tables without decoded data still appear as leaf entries so nothing is hidden.


In earlier versions of the crush Realm parser, every column was labelled col_0, col_1, … regardless of the actual schema names. That is now fixed — column names are read directly from the Realm spec node.

Top Refs — Deleted Data and Schema Diff

Before diving into tables, the Top Refs tab is worth checking first. This is specific to Realm's internal architecture and has direct forensic relevance.

Realm maintains two top-level references — Top Ref 0 and Top Ref 1 — as part of its copy-on-write architecture. At any point, one of them is active (the current state of the database) and the other represents the previous commit. The Top Refs tab shows both references, which one is currently active, and key settings and metadata for each.

Beyond the low-level header comparison, the tab now also shows a Diff — schema section. This is where it gets forensically useful: crush compares the schema of both Top Refs and lists tables that exist only in the active ref (added since the last snapshot), tables that exist only in the inactive ref (deleted since the last snapshot), and tables present in both but with a changed row count. A drop in row count between the inactive and active ref is a direct indicator that records were deleted between the two commits.
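The comparison logic itself is straightforward set arithmetic. An illustrative sketch — not crush's code — treating each ref's schema as a {table_name: row_count} mapping:

```python
def diff_schemas(active, inactive):
    """Compare two snapshot schemas: tables only in the active ref were
    added since the last snapshot, tables only in the inactive ref were
    deleted, and shared tables with differing row counts changed —
    a drop in count pointing at deleted records."""
    added   = sorted(set(active) - set(inactive))
    deleted = sorted(set(inactive) - set(active))
    changed = sorted(t for t in set(active) & set(inactive)
                     if active[t] != inactive[t])
    return added, deleted, changed
```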


Freed Data — Inspecting Unallocated Space

In addition to active and previous snapshots, crush also exposes freed data directly. Unlike the Tables tab, this is not reconstructed into rows — it is a view into unallocated regions of the Realm file.

When objects are deleted in Realm, their space is marked as free but not immediately overwritten. These regions may still contain residual data from previously stored objects. The Freed Data tab surfaces these regions as individual entries with:

  • Offset — where the free region is located in the file
  • Size — the length of the region
  • Source — whether the region originates from the active or inactive Top Ref
  • Content — decoded where possible (currently it looks for null-terminated strings)

Selecting an entry reveals the raw data, including a full hex view at the bottom of the tab. This keeps the representation close to the underlying file structure and avoids introducing assumptions about object boundaries or schema correctness.

In practice, these regions can still contain recognizable artefacts such as strings, timestamps, or fragments of previously stored objects. However, interpretation requires care: data may be partially overwritten, misaligned, or no longer internally consistent.

What this view provides is access to data that is not present in either the active or previous snapshot — and would otherwise remain invisible.


The Tables Tab — Active and Previous Snapshot

The Tables tab shows decoded table data from both Top Refs — not just the active one. This mirrors how the SQLite WAL viewer exposes superseded page versions: the idea is the same, Realm just implements it differently at the format level.

Tables from the inactive (previous) ref can be shown by activating the Show diff to prev ref checkbox. In the temporary SQLite database (more on this in the next section) the previous refs are stored as _prev_<tablename>, which makes cross-snapshot comparisons straightforward. The example data in the image is synthetically generated — I do not have real test data for this scenario in the Formula 1 app.


  • Deleted — only present in the previous ref
  • Prev Version — present in both refs, but with different content
  • Added — only present in the active ref
  • Everything else — identical in both refs

A few things that are now decoded correctly that were previously broken:

  • Timestamps — Realm timestamp columns (type 8) were previously fed through the wrong decoder, producing incorrect Unix-epoch-1970 dates. They are now decoded correctly and displayed as YYYY-MM-DD HH:MM:SS UTC.
  • Strings — format 24 stores strings in fixed-width inline entries. The previous logic produced garbage values; strings now decode correctly.
  • Booleans — nullable booleans (Realm's 2-bit encoding: 0=False, 1=True, ≥2=NULL) were previously returned as raw integers 0/1/2. They now show as True, False, or None.
  • NULL-only columns — columns with no payload were silently dropped. They now appear in the table with all-None values so the column is visible.

The SQL Bar and Cross-Snapshot Queries

When the Tables tab opens, crush loads the decoded Realm data — from both snapshots — into a temporary in-memory SQLite database. The full SQL bar is available: SELECT, WHERE, ORDER BY, aggregates, and more. The temp database is deleted automatically when the tab is closed.

Because both the active and previous snapshot tables are present in the same database, cross-snapshot comparisons are possible directly in SQL. For example, to find rows that exist in the previous snapshot but not in the active one — i.e. deleted records:

SELECT p.*
FROM _prev_class_Evidence p
LEFT JOIN class_Evidence e ON p._objkey = e._objkey
WHERE e._objkey IS NULL

Cross-Table JOINs via _objkey

Beyond cross-snapshot queries, Realm link columns store ObjKey values — the internal identifiers Realm uses to reference objects across tables. Each table in the temporary SQLite database receives a leading _objkey column containing the Realm ObjKey for every row, making cross-table joins straightforward:

SELECT ae.dt_published [Published Date], ae.title [Title], a.share_link, e.last_name [Author]
FROM class_Article a
JOIN class_AuthorEDP e ON a.author = e._objkey
JOIN class_ArticleEDP ae ON a.edp = ae._objkey

ObjKeys are not shown in the table grid — they are only present in the SQL database — so the visual view stays clean while the join capability is fully available.


Inspect Cell

As with the SQLite viewer, right-clicking any cell and selecting "Inspect cell…" opens the content in a dedicated window with auto-detection. If the content is a known format — a binary plist, JSON — it opens directly in the appropriate viewer. Otherwise the hex view is available as fallback.

Realm BLOB columns occasionally contain nested serialized data — Inspect Cell is the fastest way to find out what is in there without leaving crush.

Further Reading

Happy examining. 🐢

Friday, May 1, 2026

Crush-forensics new version - 0.6.0 - is out

Moin! 👋

crush-forensics v0.6.0 is out.

github.com/kalink0/crush-forensics

No dramatic new format support this time, no big UI overhaul. This release is about going deeper, not wider.

Forensic Integrity — Now Tested

One thing that has been on my mind since the start of this project: how do you actually prove that a forensic tool does not touch the evidence?

The answer for crush is now a dedicated forensic test suite — currently 14 tests that verify the tool is safe to run on real evidence. Not standard unit tests. Each test is framed as a forensic property guarantee, grouped into five categories: Source Immutability, No Side Effects, Read-only Media compatibility, Known-output Verification, and Reproducibility.

Every test run also produces a forensic_audit.html — a self-contained, printable report with SHA-256 hashes of the reference corpus.

SQLite WAL Forensics

The SQLite viewer has been significantly extended in v0.6.0, with a focus on WAL archaeology — recovering data from the Write-Ahead Log that would otherwise be invisible to standard tools.

When a -wal companion is present, crush now shows a full frame inventory classifying every frame as Active, Superseded, Uncommitted, or WAL slack — the last category being salt-mismatch frames from a previous WAL cycle, per Sanderson's terminology. Superseded and uncommitted frames are colour-coded amber and blue so the examiner immediately sees whether overwritten or in-flight data exists.
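The salt-mismatch classification can be sketched from the published WAL format: the 32-byte big-endian WAL header carries two salt values, and every 24-byte frame header repeats them. A simplified Python illustration (checksum verification omitted; not crush's implementation):

```python
import struct

def wal_frames(wal_bytes):
    """Walk a SQLite -wal file and label each frame's salt relationship.
    Frames whose salts match the WAL header belong to the current write
    cycle; a salt mismatch means the frame is slack from a previous
    cycle. Header: magic, version, page size, checkpoint seq, salt1,
    salt2 (all big-endian)."""
    magic, version, page_size, ckpt, salt1, salt2 = struct.unpack(
        ">IIIIII", wal_bytes[:24])
    frames, pos, idx = [], 32, 0
    frame_size = 24 + page_size
    while pos + frame_size <= len(wal_bytes):
        # Frame header: page number, db size after commit (0 = not a
        # commit frame), then the two salts copied from the header.
        pgno, commit_size, fs1, fs2 = struct.unpack(
            ">IIII", wal_bytes[pos:pos + 16])
        state = "current-cycle" if (fs1, fs2) == (salt1, salt2) else "wal-slack"
        frames.append((idx, pgno, commit_size, state))
        pos += frame_size
        idx += 1
    return frames
```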

Enabling the Show WAL history toggle appends rows decoded from non-active frames directly below the live table data, colour-coded, with a row count showing how many additional records were recovered. Double-clicking any WAL frame opens the raw page bytes in the hex viewer.

On a practical note: crush now also discovers -wal and -shm companions when a .db file is opened directly from the filesystem — previously these were silently skipped when not opening from inside an archive or folder.

More SQLite Improvements

Beyond WAL forensics, the SQLite viewer gained a number of improvements that make day-to-day examination faster.

Timestamp decoding is now built in: right-click any column header and decode it as Unix seconds, Unix milliseconds, Unix microseconds, Mac Absolute Time, Windows FILETIME, or Chrome/WebKit time — displayed as YYYY-MM-DD HH:MM:SS UTC, with sorting still working correctly on the underlying numeric value.

Two new combo entries give quick access to schema and configuration information: DB Structure lists all tables, views, indexes, and triggers with their definitions; DB Info shows 28 PRAGMA settings with decoded values and descriptions. Database views are now browsable like tables. The SQL bar now accepts PRAGMA statements, shows syntax highlighting, and supports running only a highlighted selection — useful for stepping through a complex query piece by piece.

Paste & Decode

A small but useful addition: Tools → Paste & Decode… lets you paste raw hex, base64, or plain text and open it immediately in any supported viewer — without saving anything to disk. If you have been copying BLOB values out of a hex editor or a network capture and opening temporary files just to inspect them, this should save a few steps.
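The detection behind a feature like this can be approximated with a simple heuristic — try hex first, then base64, then fall back to plain text. An illustrative sketch, not crush's exact logic:

```python
import base64
import binascii

def sniff_and_decode(text):
    """Guess whether pasted text is hex or base64 and return the raw
    bytes plus the guessed format; fall back to UTF-8 text. Ambiguous
    inputs (e.g. "cafe") resolve to the first matching format."""
    compact = "".join(text.split())
    try:
        return bytes.fromhex(compact), "hex"
    except ValueError:
        pass
    try:
        return base64.b64decode(compact, validate=True), "base64"
    except (binascii.Error, ValueError):
        return text.encode("utf-8"), "text"
```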

Export as .logarchive

iOS diagnostics nodes now have an "Export as .logarchive…" right-click action. crush assembles the correct logarchive layout — diagnostics tree plus the uuidtext/ sibling — and copies the result to a location of your choice. The output is a standard .logarchive folder that can be opened directly in Console.app or processed with the log CLI. As I do not have a Mac test system, this functionality is not tested yet.

Parallel Unified Log Conversion

Multi-Log Studio now splits large logarchives across multiple unifiedlog_iterator processes — one per physical core by default — and streams entries into the viewer as each chunk finishes. On a typical 200 MB acquisition this gives roughly a 25% reduction in wall time. Unified Log support is still alpha, and there is more work ahead, but it is moving in the right direction.

File Format Database

All entries in the built-in file format database have been reviewed and refined — descriptions improved, forensic context expanded, reference URLs added. Five new formats were added: Apple Keychain, Android Keystore, iOS Backup (iTunes/Finder), Windows Prefetch, and Gzip.

What's Next

The SQLite viewer is the first to receive this level of dedicated depth and tooling — and it won't be the last. The plan is to work through all parsers and viewers systematically, one at a time, done properly. A dedicated deep-dive post on the SQLite viewer is also in the works, with the same planned for other parsers and viewers down the line.

As always: feedback, issues, and pull requests are welcome on GitHub.

Just keep swimming. 🐢

What Hides in the WAL — SQLite Forensics with crush

Moin! 👋

SQLite is everywhere in digital forensics. iOS, Android, macOS, Windows — virtually every app that needs to store structured data uses it. Messages, call logs, browser history, location data, app state — if it is on a mobile device, there is a good chance it lives in a SQLite database. If you want to go deeper on the file format itself, the official SQLite file format documentation is the right starting point.

With v0.6.0, the crush SQLite viewer has grown significantly. This post walks through what it can do — and more importantly, what the features actually mean for your investigation.

Opening a Database

crush opens SQLite databases directly from inside ZIP or TAR archives, from folders, or as standalone files. No extraction needed. When you select a SQLite file, the SQLite viewer opens automatically. The combo box at the top lets you switch between tables, views, and the generated entries described below.

One important detail: crush opens all SQLite connections in read-only mode. This matters because closing a WAL-mode database that was opened with a read-write connection triggers an automatic checkpoint — which commits and clears the Write-Ahead Log, potentially destroying forensic data. More on WAL below.
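In Python's sqlite3, the same read-only discipline looks like this — a URI connection with mode=ro, which refuses all writes and therefore cannot checkpoint on close (the helper name is mine):

```python
import sqlite3

def open_evidence(db_path):
    """Open a SQLite database read-only via a URI connection string.
    mode=ro rejects every write, so closing the connection cannot
    trigger a checkpoint that would commit and clear a -wal file."""
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
```

Any attempt to modify the database through such a connection raises an OperationalError instead of touching the file.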

The Summary View

The default view when opening a database is the Summary. It shows all tables and views with their row counts, and a status label giving the full schema object count — tables, views, indexes, and triggers at a glance.

This is useful for orientation before diving in: how many tables are there, which ones have data, and are there any views worth looking at?

DB summary overview in crush

DB Structure

The DB Structure entry in the combo box lists every schema object in the database — tables, views, indexes, and triggers — with structural details: column definitions for tables, CREATE SQL for views, index definitions, and trigger signatures.

When I open an unfamiliar database, this is where I start, directly after taking a look at the summary. What tables are there? What do the columns tell me about what the app stores? I also pay attention to indexes — they often hint at what the app queries frequently — and triggers, which can automatically update timestamps or delete records without the user doing anything. Understanding the schema is the foundation for everything that follows.

DB structure view in crush

DB Info — PRAGMA Settings

The DB Info entry shows 28 SQLite PRAGMA settings with their current values and plain-English descriptions. PRAGMA statements are essentially configuration queries — they reveal things like the page size, the journal mode, the encoding, and the application ID.

The journal mode is particularly relevant: a value of WAL means the database uses a Write-Ahead Log, which has significant forensic implications (see below). The application_id PRAGMA can sometimes identify the application that created the database. The user_version field is used by many apps to track database schema versions — useful for understanding which version of an app created or last modified the database.


DB Info PRAGMA view in crush


Timestamp Decoding

Timestamps in SQLite databases are stored as raw numbers — but the epoch and unit vary wildly between apps and platforms. Right-clicking any column header and selecting "Decode column as timestamp" lets you tell crush how to interpret the values.

Supported formats:

  • Unix seconds — seconds since 1970-01-01 UTC (most common on Android)
  • Unix milliseconds — same epoch, millisecond precision
  • Unix microseconds — same epoch, microsecond precision
  • Mac Absolute Time — seconds since 2001-01-01 UTC (Apple's CoreData default)
  • Windows FILETIME — 100-nanosecond intervals since 1601-01-01 UTC
  • Chrome / WebKit time — microseconds since 1601-01-01 UTC (Chrome history databases)

The decoded values are displayed as YYYY-MM-DD HH:MM:SS UTC and the column header shows the active format as a suffix — for example created_at [mac abs]. Sorting still works correctly because the raw numeric value is preserved internally. Select "Clear timestamp format" to revert to the raw value.

If you are unsure which format a column uses, a quick sanity check is to look at the magnitude of the values: Unix seconds for recent dates are around 1.7 billion; Mac Absolute Time values for recent dates are around 750 million; Chrome time values are around 13 trillion.
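The epochs and units behind these formats are easy to pin down in code. A small reference sketch (helper names are mine):

```python
from datetime import datetime, timedelta, timezone

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)  # FILETIME / Chrome
EPOCH_1970 = datetime(1970, 1, 1, tzinfo=timezone.utc)  # Unix
EPOCH_2001 = datetime(2001, 1, 1, tzinfo=timezone.utc)  # Mac Absolute Time

def decode_timestamp(value, fmt):
    """Convert a raw numeric timestamp to an aware UTC datetime."""
    if fmt == "unix_s":
        return EPOCH_1970 + timedelta(seconds=value)
    if fmt == "unix_ms":
        return EPOCH_1970 + timedelta(milliseconds=value)
    if fmt == "unix_us":
        return EPOCH_1970 + timedelta(microseconds=value)
    if fmt == "mac_abs":    # seconds since 2001-01-01
        return EPOCH_2001 + timedelta(seconds=value)
    if fmt == "filetime":   # 100-nanosecond ticks since 1601-01-01
        return EPOCH_1601 + timedelta(microseconds=value / 10)
    if fmt == "chrome":     # microseconds since 1601-01-01
        return EPOCH_1601 + timedelta(microseconds=value)
    raise ValueError(f"unknown format: {fmt}")
```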

The following pictures show one column converted and the other in its original state.

Timestamp decoding in crush SQLite viewer

Just to mention it: timestamps can also be converted directly in the SELECT query — this is basic SQLite syntax (reference: https://sqlite.org/lang_datefunc.html).

Inspect Cell

BLOB cells are common in SQLite databases — and they often contain the most interesting data. Right-clicking any cell and selecting "Inspect cell…" opens a dedicated window that tries to auto-detect the content format. If it recognises something — a binary plist, XML, base64 (more formats to come) — it opens the data directly in the appropriate viewer. If auto-detection does not find a match, you can manually select from the available formats or fall back to the hex view.

In practice I use this constantly. A BLOB that looks like noise in the table view often turns out to be a bplist or base64-encoded data — and Inspect Cell gets me there in two clicks without touching the filesystem.

Inspect cell in crush SQLite viewer

The Inspect Cell command currently supports only a handful of formats — more are on the way.

The SQL Bar

The SQL bar lets you run custom queries directly against the open database — useful when you want to filter, join, or aggregate data beyond what the table view shows. A few things worth knowing:

  • PRAGMA statements are accepted alongside SELECT and WITH
  • Selected text only — if you highlight a fragment of a query and press F5, only the selection is executed. Useful for debugging a complex query step by step
  • Syntax highlighting — keywords, strings, numbers, and comments are highlighted
  • Status feedback appears below the input field — red on error, normal on success
  • The pane between the SQL bar and results table is resizable


SQL bar in crush SQLite viewer


WAL Forensics — Going Deeper

This is where things get forensically interesting. SQLite has two journal modes: rollback (the default) and WAL (Write-Ahead Log). In WAL mode, changes are not written directly to the database file — they are first written to a separate -wal file and later checkpointed (merged) back into the main database. Most modern iOS and Android apps use WAL mode.

The forensic value of the WAL comes from what survives in it. For a detailed explanation of the WAL format and its forensic implications, I highly recommend Sanderson Forensics' write-up on WAL forensics — it is the reference text on this topic. The book by Paul Sanderson and other great people in our field is also a standard reference on SQLite forensics, including WAL files — definitely worth a read in my opinion, and the information still holds up after all these years.

Sanderson, Paul. SQLite Forensics. Independently Published, 2018. ISBN: 978-1980293071.

In crush, when a -wal companion is present alongside a database, a new WAL Frames entry appears in the combo box. It shows a full frame inventory with every frame classified:

  • Active — the current version of a page, reflected in the live database view
  • Superseded — an older version of a page that has since been overwritten by a newer frame. The older version of the data may still be readable here
  • Uncommitted — frames that were written but never committed — the transaction was rolled back or the app crashed
  • WAL slack — frames from a previous WAL cycle (salt mismatch). After a checkpoint, new WAL writes start from the beginning of the file, overwriting old frames — but until that happens, old frames from the previous cycle remain

Superseded and uncommitted frames are colour-coded amber and blue so you immediately see whether overwritten or in-flight data exists. The Table column shows which schema object owns each page.
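The mechanics behind this classification can be sketched from the WAL file format itself. crush's classifier does more (it verifies frame checksums and tracks the latest frame per page to separate Active from Superseded); the following is a minimal illustration of the salt-match and commit logic only, assuming a structurally intact -wal file:

```python
import struct

WAL_HDR = 32    # magic, version, page size, ckpt seq, salt-1, salt-2, checksums
FRAME_HDR = 24  # page no, db size after commit, salt-1, salt-2, checksums

def inventory_wal(path):
    """Classify WAL frames by salt match and commit status.
    Simplified sketch: no checksum verification, and no per-page tracking
    to distinguish Active from Superseded frames."""
    with open(path, "rb") as f:
        hdr = f.read(WAL_HDR)
        magic, version, page_size, ckpt, salt1, salt2 = struct.unpack(">6I", hdr[:24])
        frames = []
        while True:
            fh = f.read(FRAME_HDR)
            if len(fh) < FRAME_HDR:
                break
            page = f.read(page_size)
            if len(page) < page_size:
                break
            pgno, commit_size, fsalt1, fsalt2 = struct.unpack(">4I", fh[:16])
            frames.append({"page": pgno, "commit": commit_size != 0,
                           "salt_ok": (fsalt1, fsalt2) == (salt1, salt2)})
    # Frames after the last commit of the current salt cycle were never committed
    last_commit = max((i for i, fr in enumerate(frames)
                       if fr["salt_ok"] and fr["commit"]), default=-1)
    for i, fr in enumerate(frames):
        if not fr["salt_ok"]:
            fr["class"] = "wal-slack"      # previous WAL cycle (salt mismatch)
        elif i > last_commit:
            fr["class"] = "uncommitted"    # written but never committed
        else:
            fr["class"] = "active-or-superseded"
    return frames
```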


SQLite WAL frame inventory in crush


Enabling the Show WAL history checkbox in the table toolbar appends rows decoded from Superseded, Uncommitted, and WAL-slack frames directly below the live data — colour-coded, with a row count showing how many additional records were recovered. Double-clicking any WAL frame row opens the raw page bytes in the hex viewer.

The DB Info view also shows a WAL summary when a companion is present: total frames, active / superseded / uncommitted / WAL-slack counts — a quick health check before diving into the frame inventory.

Paste & Decode

One more addition worth mentioning in the context of SQLite work: Tools → Paste & Decode… lets you paste raw hex, base64, or plain text and open it in any supported viewer — including the SQLite viewer — without saving anything to disk. If you have been copying BLOB values out of a cell and creating temporary files just to inspect them, this saves a few steps.
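The detection idea is straightforward. This is only an illustration of the approach, not crush's actual detection code — note that short all-hex strings are inherently ambiguous (a valid hex string is often also valid base64), so the check order matters:

```python
import base64
import binascii
import string

def sniff_and_decode(text: str) -> tuple[str, bytes]:
    """Best-effort guess whether pasted text is hex, base64, or plain text.
    Hex is tried first, so ambiguous all-hex input is reported as hex."""
    compact = "".join(text.split())  # pasted dumps often contain whitespace
    # Hex: only hex digits, even length
    if compact and len(compact) % 2 == 0 and all(c in string.hexdigits for c in compact):
        return "hex", bytes.fromhex(compact)
    # Base64: strict decode must succeed and round-trip exactly
    try:
        raw = base64.b64decode(compact, validate=True)
        if base64.b64encode(raw).decode() == compact:
            return "base64", raw
    except (binascii.Error, ValueError):
        pass
    return "text", text.encode("utf-8")
```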

A Note on WAL and Evidence Handling

To close with something practical: if you open a WAL-mode database with a standard SQLite tool in read-write mode, the automatic checkpoint will commit and clear the WAL before you see anything. This is silent and irreversible on your working copy.

crush prevents this by opening all connections with the mode=ro URI flag. If you are working with SQLite databases outside of crush, always use PRAGMA query_only = ON or open with a read-only URI to protect the WAL. The forensic test suite in v0.6.0 includes a dedicated test that verifies this behaviour on every build.
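For scripted analysis the same protection is one line with Python's sqlite3 module — a minimal sketch of the read-only URI pattern:

```python
import sqlite3

def open_readonly(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database strictly read-only via a URI, so no automatic
    checkpoint can touch the -wal file. (Paths containing '?' or '#' would
    additionally need URL-encoding.)"""
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
```

Any write attempt on such a connection raises an error instead of silently modifying the database. `PRAGMA query_only = ON` is an alternative, but mode=ro protects the file from the moment the connection is opened.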

Happy examining. 🐢

Saturday, April 25, 2026

Introducing crush: A DFIR Workbench for Surfing Through Data Formats

Moin! 👋

Today I want to share something a little different from the usual artifact analysis posts — I am releasing crush-forensics v0.5.0, a digital forensic analysis workbench I have been building for a while.

You can find it on GitHub: github.com/kalink0/crush-forensics

The Problem

When I work with acquisitions — especially mobile ones — I often find myself wanting to quickly look at a specific file without firing up a full forensic platform. Maybe I want to check a PLIST, peek into a SQLite database, or just confirm what a binary blob actually contains. Opening the whole acquisition in a heavy tool just to answer a quick question felt like overkill.

At the same time, reaching for a hex editor or writing a one-off script every time also gets old fast. I wanted something in the middle: a lightweight, dedicated workbench that knows about the file formats we actually work with in DFIR — and that lets me navigate directly inside ZIP and TAR archives without extracting anything to disk first.

That is what crush is.

Why "crush"?

I am a big Finding Nemo fan. Crush is the laid-back sea turtle who surfs the East Australian Current — and that is exactly the vibe I wanted for this tool: just riding through data formats, going with the flow. 🐢

The idea of surfing through file formats felt like the right metaphor — you open an archive, navigate the structure, and glide from a PLIST to a SQLite DB to a hex view without fighting the current. Dude.

What Can It Do?

crush is a Python-based GUI application (built with PySide6). It supports:

  • Opening and navigating inside ZIP and TAR archives directly — no extraction to disk needed
  • Opening single files and folders
  • Exporting files/folders and opening them directly in external software
  • Hex Viewer
  • SQLite Viewer
  • Text Viewer with syntax highlighting and encoding detection
  • JSON Viewer (collapsible tree)
  • XML Viewer (collapsible tree)
  • PLIST and BPLIST Viewer
  • SEGB v1 and v2 Viewer
  • ABX Viewer (Android Binary XML)
  • LevelDB Viewer (Chrome LevelDB / Android app databases)
  • Image Viewer
  • Media Viewer (audio and video)
  • Multi-Log Studio — multi-source log analysis with format auto-detection, including Apple Unified Log / .tracev3 / .logarchive, syslog, and more (note: Unified Log support is currently alpha — decoding can be slow, this is actively being worked on)
  • Protobuf Viewer — schema-less, with optional schema decoding
  • PDF text extraction
  • Realm Database Viewer — header, schema/class extraction, top-ref comparison, table/column data decoding

Android ABX viewer in crush (Linux)



iOS SQLite viewer in crush (Windows)


Built-in Data Format Database

One feature I am particularly happy with is the built-in data format database. Crush identifies forensically relevant formats by magic bytes and extension, and surfaces the information directly in the UI for every selected file — including formats that do not have a dedicated viewer yet.

For each format it shows:

  • Full name and abbreviation
  • Category (database, configuration, log, ...)
  • Forensic relevance — what an investigator is likely to find here
  • Relevant platforms (iOS, Android, macOS, Windows, ...)
  • Magic bytes with offset and description
  • Links to format specs and relevant forensic research

Format Reference — the built-in data format database in crush


So for example, when you open an ABX file, crush identifies it via its magic bytes, opens it in the ABX viewer, and right there in the UI you can see that this is an Android Binary XML file used for system and app settings — plus links to the AOSP source and relevant research from CCL Solutions. No more alt-tabbing to a browser to remember what a format is.


Integrity Mode

Crush also has an optional integrity mode for auditability. When enabled:

  • Records SHA-256 hashes when files are opened or exported
  • Hashes ZIP/TAR/file sources on open (folders are not hashed)
  • Writes hashes to the log
  • Creates a crush-export-hashes.txt file next to exported data

You can toggle it via the status badge in the bottom right of the UI. It can also be turned off for faster opening of large ZIP/TAR sources. A small but useful addition for anyone who needs to demonstrate that what they examined is what they got.
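The core of such an integrity mode is just streamed SHA-256 hashing plus a hash manifest next to the export. A minimal sketch — the manifest file name mirrors crush's, but the exact line format here is my own assumption, not crush's:

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream-hash a file so large evidence files never load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def write_export_hashes(export_dir: Path) -> Path:
    """Write one '<sha256>  <relative path>' line per exported file
    (line format is an assumption for illustration)."""
    out = export_dir / "crush-export-hashes.txt"
    lines = [f"{sha256_file(p)}  {p.relative_to(export_dir)}"
             for p in sorted(export_dir.rglob("*"))
             if p.is_file() and p.name != out.name]
    out.write_text("\n".join(lines) + "\n")
    return out
```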


Integrity Mode dialog in crush

Installation

v0.5.0 ships with pre-built binaries for Windows, Linux, and macOS — available directly from the GitHub releases page. No Python environment needed, no dependency wrangling — just download and run.

One honest caveat: the macOS binary is currently untested as I do not have a macOS system available. If you are on macOS and give it a try, please let me know how it goes via GitHub issues or direct message — that feedback would be really valuable.

If you prefer to run from source (e.g. for development), the README has full instructions, including the platform-specific system dependencies and how to download the Unified Log parser binaries.

Credits

Crush builds on some excellent work from the DFIR community. Bundled third-party modules include:

  • ccl_bplist by CCL Solutions Group — binary plist parsing
  • ccl_segb by CCL Solutions Group — SEGB parsing
  • ccl_leveldb by CCL Solutions Group — LevelDB / Chrome LevelDB parsing
  • macos-UnifiedLogs by Mandiant — Apple Unified Log parsing

Thank you to everyone in the community whose research and open source work made this possible.

Nightly Builds

If you want to live on the edge — every night a fresh build is automatically created from the latest main branch, including the most recent features and fixes. Changes are tracked in the changelog.md in the repo. Not recommended for production casework, but great if you want to try out what is coming next or help with testing.

What's Next?

There is still a lot I want to add — more decoders, better search, maybe some lightweight timeline capabilities. I will keep building as I run into things I wish it could do during actual casework.

If you try it out, I would love to hear what features would be most useful to you. Open an issue on GitHub, or reach out to me on Mastodon, Bluesky, or LinkedIn.

Until next time — happy forensicating! 🔍

GitHub: github.com/kalink0/crush-forensics

Saturday, March 14, 2026

xLEAPP - Helper scripts for pulling/cloning and creation of Windows exe

Hey there,

I was asked if there is an easy way to have the latest xLEAPP (ALEAPP and iLEAPP) version on Windows with all the most recent parsers from the main repositories.

I created batch scripts for this - they can be found in my github repo "useful_scripts" at https://github.com/kalink0/useful_scripts/tree/master/windows/leapp-build

The start_update*.bat files pull the latest main branch from the original repositories (https://github.com/abrignoni/ALEAPP.git and https://github.com/abrignoni/iLEAPP.git), create a venv, and then install all necessary requirements.


After this, the user can run ALEAPP/iLEAPP by executing the batch file. At every execution, the batch file asks whether it should fetch the newest files from the repos; if not, it just executes xLEAPP the way the user wants.

Additionally, I created create_exe*.bat files. These create Windows executables via PyInstaller based on the existing clone of the repo on the machine. The executables can then be used on other machines without needing Python, git, or an active Internet connection.

There are several versions of each batch file, one for ALEAPP, one for iLEAPP.

Python and git must be installed on the machines that execute the script.

Perhaps these scripts are also useful for other people - feel free to use them.


Bubbly now with GUI - New Release 1.2.4

 Hey there, I did a bit of work on my small application Bubbly.

What's new?

First and most important:

I implemented a GUI for Bubbly itself AND a GUI for the CSV-to-Bubbly JSON converter. Now you can open a CSV, map the columns easily, and convert the timestamps into the necessary format - so it is much easier to produce the input file Bubbly needs.

Additionally, I added a parser for the SQLite database of the Android Romeo dating app (already in version 1.1.0) - you can directly input this app's SQLite database and create a bubble view of all the chats.

New version (executable for Windows and Linux) can be found at https://github.com/kalink0/bubbly/releases/tag/v1.2.4

Next step is to allow other file formats for the Bubbly Converter to make it more versatile.

Saturday, February 21, 2026

Samsung Trash Provider App - Traces of deleted files never hurt

This post shows the results of my research on the Samsung Trash Provider app on Samsung Android devices.

Android Version Tested:
Android 16

Test device:
Samsung Galaxy S23

Database location:
[...]/com.samsung.android.providers.trash/databases/trash.db


What is this app about?

The Samsung Trash Provider app is not an app a user directly interacts with. It is used by other apps (e.g. the My Files and Gallery apps) to offer trash bin functionality. In earlier OneUI/Android versions this worked differently: every single app had its own trash functionality implementation.
This means it is still possible that trash data is also stored by the individual app.

What's in the databases?


The database "trash.db" contains one particularly interesting table called "trashes", with the following interesting columns:

  • _data - Current file path
  • original_path - The original File path
  • _id - Some file id
  • media_id - an additional id, empty for my data
  • sec_media_id - id in the corresponding Samsung app; was filled for the Gallery app and matched the ID there in my data
  • is_cloud - always 1 for my data, but I did not have any cloud sync active, so the meaning is not known atm
  • volume_name - The name of the volume the data is stored on
  • title - In my data always the file name with file extensions
  • _display_name - What is shown in the apps - identical to file name in my data
  • _size - Size of file in bytes
  • mime_type - Recognized MIME type of the file
  • media_type - 1 = Image, 2 = Audio, 3 = Video, Pretty sure there are more types but I only have these in my data atm
  • datetaken - if media was taken on the phone this value is set; I did not find another case where it was filled
  • date_expires - the date/time this file will be deleted completely from trash
  • date_deleted - the date/time the file was moved to trash
  • user_id - which user account deleted it (0 = main user, 150 = secure folder context)
  • extra - additional info from the source app, JSON based - can hold exif data (location, timestamp taken etc.)


Example SELECT query:

SELECT
    _id [File ID],
    _data [Trash File Path],
    original_path [Original File Path],
    title [File Titel],
    _display_name [File Name],
    _size [File Size],
    mime_type [MIME Type],
    delete_package_name [App Context],
    is_cloud [Cloud?],
    user_id [User ID],
    strftime('%Y-%m-%d %H:%M:%S.', "date_deleted"/1000, 'unixepoch') || printf('%03d', "date_deleted"%1000) [Deletion Timestamp],
    strftime('%Y-%m-%d %H:%M:%S.', "date_expires"/1000, 'unixepoch') || printf('%03d', "date_expires"%1000) [Expiration Timestamp],
    extra [Extra Info JSON]
FROM trashes
        

Location of the deleted files

The files are stored in the media part of the Android system under Android/.Trash
Full path can be e.g. /data/media/0/Android/.Trash/[...]

The pattern in this is as follows:

[appcontext]/[file_uid]/[old_file_path]/.!%#@$/[old_file_name]

Example:

/storage/emulated/0/Android/.Trash/com.sec.android.app.myfiles/f00f4be8-7dad-4996-aedb-6edc4b9f42d3/1763651528662/storage/emulated/0/Music/Telegram/.!%#@$/deleted_file.ogg


With this info, even if we do not have any database access, we can tell a bit about the file. From the example above:

  • App Context - the file was deleted via the Samsung My Files app
  • Old File Path - the original file path was /storage/emulated/0/Music/Telegram/
  • Old File Name - the original file name was deleted_file.ogg
This fits what I've done on my test phone.
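Splitting such a path programmatically is simple. A small sketch based on the layout observed above (note the example also contains a numeric segment between the UID and the original path, which looks like a millisecond timestamp; the layout may vary across OneUI versions):

```python
def parse_trash_path(path: str) -> dict:
    """Recover app context, original path, and original name from a Samsung
    .Trash path. Sketch based on one observed layout, not a specification."""
    _, _, rest = path.partition("/Android/.Trash/")
    before_marker, _, name = rest.rpartition("/.!%#@$/")
    parts = before_marker.split("/")
    app_context = parts[0] if parts else None
    file_uid = parts[1] if len(parts) > 1 else None
    # parts[2] is the numeric segment seen in the example (possibly a ms
    # timestamp); everything after it is the original parent path.
    original_parent = "/" + "/".join(parts[3:]) if len(parts) > 3 else None
    return {"app_context": app_context, "file_uid": file_uid,
            "original_parent": original_parent, "original_name": name}
```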


Additionally Good to know

Based on the values for date_expires and date_deleted, one can say that by default files stay in the trash for 31 days and are then deleted automatically.

But in my test data the database entry was kept even after that - so the file was gone, but the trace of it was still there.

I also restored one file (undeleted via the GUI): the entry was immediately removed from the database and the file was moved back to its original location.


Conclusion

1. The Samsung Trash Provider app stores info on deleted files, including files deleted from other apps like Samsung Gallery and My Files.
2. The files are kept in the trash for 31 days.
3. Even if we only have the files in the default trash folder location (e.g. /data/media/0/Android/.Trash/), we can say a few things about them, e.g. in which app context a file was deleted, its original path, and its original file name.


I've created an ALEAPP parser for this. It is part of an already open PR (#665) waiting to be merged into the ALEAPP repo.

Sunday, February 15, 2026

BUBBLES!!! BUBBLES! [gibbering] MY BUBBLES! - Chats (export/backups) need bubbles

In the last few days I took some time to develop a few things. And it has to do with bubbles ;-)

I developed "Bubbly" - a small tool that parses different messenger export and backup formats and visualizes the data in an HTML bubble view. The data is filterable, searchable, and taggable, and media files are supported.

The tool can be found at https://github.com/kalink0/bubbly. It is a Python-based CLI application, and I have built versions for Linux and Windows on GitHub.

How do the bubbles look?

The Overview HTML

When running the tool, an overview page is created that shows the generated thread views. One can also decide to put all chat threads into one view; then only a single entry is shown on the overview page.



The Chat Thread HTMLs

The created thread view that shows the data with the bubbles is divided into three parts.

1. The header with general info:

This part can be collapsed. From here one can load or export created tags and go back to the overview page.

2. The control panel:


Which contains the filter and search functionality.

3. MY BUBBLES!


Tagging is possible via the button in the bubbles (only three predefined tags at the moment).


Current implemented parsers

Currently I have the following parsers implemented:

1. WhatsApp Chat Export

This parser parses the Chat Export from the iOS and Android WhatsApp app.

Because the output format depends on the system language, currently only German and English are fully supported. But I am working on a more robust implementation.

2. Wire Messenger Backup

Parses Wire backup files - only unencrypted backups are supported atm.

3. Telegram Desktop Chat Export

Parses chat exports from the Telegram Desktop app.

4. Threema Backup Parser

Parses backups of the Threema messenger. But: the data needs to be decrypted before parsing - there is no decryption feature.

5. Generic JSON Parser

I've built a generic JSON parser with a simple JSON design, and a mapping script is in my repo. The main idea: just create the simple JSON for a messenger, and Bubbly does the rest to get the bubbles.

Additional functionality

1. Branding - it is possible to give a logo that is shown in the top left corner
2. General case information - Case number, name of creator are necessary to create an HTML
3. Logging - the read files and written files are documented including the setting used. (Not the media files)
4. Config File usage
5. CLI interactive mode


Why did I develop it?

Main reasons:

- Easy to use and to hand over to other people - especially the bubble view and the media file support in a directly usable HTML.

- Fast to implement new parsers for not so well known formats/messengers (Exports and Backups)

- Basic filtering, searching and tagging functionality

Well, I needed a fast way to visualize the basic information from exports and backups of different messengers - that is what I focused on. And the bubble view is the one that everyone, including non-technical people, knows and understands, so I wanted this view.


The limitations - or - what is Bubbly NOT

Bubbly is not a "forensic chat analysis tool". It focuses on the data from exports and backups, and additionally makes it possible to create bubble views for other messengers, though with less functionality. I don't plan to support full-featured parsing of the well-known messengers like Signal, WhatsApp, or Telegram on Android and iOS (e.g. statuses, message replies, forwarding, reactions, etc.) with Bubbly. There are other tools out there for that.

I want to keep it as simple as possible.

Also, there won't be any export functionality. I thought about it at first, but the simple platform and the use of web browsers limit what is possible. Tagging and export/import of tags are supported, so one can share those at least.

I tested Bubbly with data sets of around 20,000 messages per thread (incl. media files) without any problem - but there will be a limit for sure ;-)


Saturday, January 10, 2026

Inside Android Samsung DHMS: Extracting Device Health, Thermal, and App-Control Data from Android - Part 1

As mentioned in my last post, I am currently researching Samsung system apps on Android.

In this post, I share some of my findings on the Samsung Device Health Management Service (SDHMS).

As the name suggests, SDHMS is responsible for monitoring the overall health of the device — including power consumption, temperature, and system load. It detects abnormal behavior (“anomalies”) in running apps, packages, and processes.

To do this, SDHMS records system health data. Some of this data may be valuable for forensic analysis.

App path: /data/data/com.sec.android.sdhms


Anomalies

The first database I examined was anomaly.db.

This database stores information about detected anomalies. On my test device, no anomalies were triggered, but one table stood out as particularly interesting to me:

Table "config_history": 

This table has four columns:
  • time - Timestamp of the entry, stored as Unix epoch time in milliseconds
  • config_key - The key of the configuration that was loaded
  • config_version - The version of the loaded configuration
  • reason - The reason why the entry was generated
In my test data, most entries in the reason column are BOOT_COMPLETED. I checked the last five entries, and they indeed correspond to my device reboots.

My results span approximately 1.5 years back, covering the full time span the device was in use. Additionally, I found entries with SYSPACKAGE_REMOVED, which likely indicate that a system package was removed from monitoring. This is less relevant for my current analysis.

From this, I was able to reconstruct a history of device (re)boots for my test device.
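The reconstruction above boils down to one query. A hypothetical sketch (not my ALEAPP parser code) that pulls the BOOT_COMPLETED entries and converts the millisecond timestamps:

```python
import sqlite3
from datetime import datetime, timezone

def boot_history(anomaly_db: str) -> list:
    """List probable (re)boot times from anomaly.db's config_history table,
    based on the BOOT_COMPLETED entries described above."""
    con = sqlite3.connect(f"file:{anomaly_db}?mode=ro", uri=True)
    rows = con.execute(
        "SELECT time FROM config_history "
        "WHERE reason = 'BOOT_COMPLETED' ORDER BY time").fetchall()
    con.close()
    # time is Unix epoch in milliseconds
    return [datetime.fromtimestamp(ms / 1000, tz=timezone.utc) for (ms,) in rows]
```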

Table "anomaly_history":

On my test device, this table does not contain any data.

Based on the available columns, it appears that this table would record information about apps or packages that triggered an anomaly, including the type of anomaly and the timestamp of the event.

Generating meaningful test data for this table is challenging. Nevertheless, it seems designed to provide historical insight into anomalous app behavior when anomalies are detected.

Table "current_config":

I am confident that this table contains the current configuration or a reference to the config file.

  • time - Timestamp of the entry in Unix epoch milliseconds
  • config_key - Key of the configuration loaded
  • config_version - Version of the loaded configuration
  • config_data - The configuration data itself

The config_key and config_version correspond exactly to those referenced in the config_history table.

The config_data column holds a large number of characters — for my device, 7,413 characters — which appears to be Base64-encoded.

Using CyberChef, it was possible to decode it:

1. From Base64

2. Protobuf Decode

3. JSON -> results in readable, structured data

The decoded content contains a lot of information, such as descriptions of anomaly types configured on the device. While it may not be immediately relevant to my current analysis, knowing how to read and decode this data will perhaps be useful in other cases.

Thermal Log

The next database I examined was thermal_log, which — as the name suggests — is responsible for thermal and system performance logging.

Table "NETSTAT": 

As the name implies, this table stores network usage statistics.

For my test device, data spans approximately 5 days.

  • start_time - The start timestamp of measurement time window, Unix epoch time ms
  • end_time - The end timestamp of measurement time window, Unix epoch time ms
  • package_name - name of the package that was measured
  • uid - The App ID on the device
  • net_usage - Bytes transferred in the measurement time window
This table allows you to track network activity per app over time.
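For a quick triage of who generated the most traffic, the per-window rows can be aggregated with a single query. A hypothetical helper (column and table names taken from the list above):

```python
import sqlite3

def top_talkers(thermal_log_db: str, limit: int = 10):
    """Sum net_usage per package across all measurement windows in the
    NETSTAT table and return the heaviest network users first."""
    con = sqlite3.connect(f"file:{thermal_log_db}?mode=ro", uri=True)
    rows = con.execute(
        "SELECT package_name, SUM(net_usage) AS total_bytes "
        "FROM NETSTAT GROUP BY package_name "
        "ORDER BY total_bytes DESC LIMIT ?", (limit,)).fetchall()
    con.close()
    return rows
```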

Table "CPUSTAT": 

The CPUSTAT table seems to store data on CPU Usage statistics.

Again, for my test device, the data covers roughly 5 days.

  • start_time - The start timestamp of measurement time window, Unix epoch time ms
  • end_time - The end timestamp of measurement time window, Unix epoch time ms
  • uptime - Uptime in seconds
  • cputime - Total CPU time used by process since last boot
  • process_name - Name of the process measured
  • uid - Package ID related to the process
  • pid - Process ID
  • process_usage - CPU time used in measurement time windows

Note: CPU time values are scaled and depend on the number of cores. In general, the higher the process_usage value, the higher the CPU utilization and load generated by the process.

Table "TEMPERATURE":

This table contains temperature sensor data for the device.

For my test device, data covers roughly 5 days.

  • timestamp - The start timestamp of measurement time window, Unix epoch time ms
  • skin_temp - Chassis Temperature
  • ap_temp - Processor Temperature
  • bat_temp - Battery Temperature
  • usb_temp - USB Temperature
  • chg_temp - Charging IC Temperature
  • pa_temp - Power Amplifier Temperature (Cell Radio)
  • wifi_temp - WiFi Temperature

Notes on values:

  • Temperatures are stored in degrees Celsius × 10, so divide by 10 to get °C.
  • On devices with different regional settings, values may instead be stored in degrees Fahrenheit.
  • Not all devices provide all sensor data, and some columns may aggregate readings from multiple sensors depending on the hardware.
In general, this table allows you to get an overview of the device’s thermal state.
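The unit conversion from the notes above in two lines (the Fahrenheit branch is for devices with regional settings that store °F × 10):

```python
def decode_temp(raw: int, unit: str = "C") -> float:
    """Convert a stored temperature value (degrees x 10) to degrees Celsius.
    unit='F' handles devices that store Fahrenheit x 10, per the note above."""
    value = raw / 10
    if unit == "F":
        value = (value - 32) * 5 / 9
    return value
```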

Conclusion

In the anomaly.db database, we can extract information such as when the device was rebooted. On my test device this data goes back for 1.5 years - the complete usage time frame of the device.

The thermal_log database provides detailed insights into system activity, including:

  • CPU load generated by apps and processes
  • Network usage per app
  • Device temperature across multiple sensors

On my test device, this data spans roughly 5 days.

What's next?

There is still additional data stored by the app that I haven’t covered in this post. For example:

  • Databases containing battery usage per app for the last 7 days — these require some mapping of numeric package IDs to package names.
  • Log files that record e.g. which app was in use and which SIM card (slot) was active.

I will cover these findings and share further insights in a follow-up post on SDHMS.

I created parsers for ALEAPP for the results shared here - PR is open.