Parsing IFC Files for the Web: A Developer's Guide to Open BIM Data

Why Parse IFC Files?

Every BIM model eventually needs to leave its authoring tool. Architects export from ArchiCAD, structural engineers from Tekla, MEP engineers from Revit — and the common format they all speak is IFC (Industry Foundation Classes).

If you're building web tools for the construction industry — whether that's a model viewer, a quantity takeoff tool, an automated code checker, or an HVAC calculation platform — you need to read IFC files. The problem: IFC files are dense, deeply nested, and use a format (STEP/ISO 10303-21) that most people in the industry have never looked inside.

This post explains what's inside an IFC file, how to extract useful data from it, and how to make that data available to web applications.

What's Inside an IFC File?

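An IFC file is a text file (usually with a .ifc extension) containing entity instances encoded in the STEP format. Each instance line defines one entity with a numeric ID that is unique within the file, a type, and a list of attributes. Not every entity is a building element: relationships, property sets, and geometry are entities too. For a building element, the attributes include things like a name, a global identifier, a placement in 3D space, and a geometric shape.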
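To make the line structure concrete, here is a minimal sketch that splits one STEP instance line into its ID, type, and raw attribute string. The sample line is hand-written for illustration and not taken from a real model, and `split_instance` is a hypothetical helper. The regular expression assumes the instance fits on a single line, so treat it as a reading aid rather than a parser.

```python
import re

# Hand-written sample line for illustration; not taken from a real model.
# In STEP, "$" marks an unset attribute and "#n" references another instance.
SAMPLE = "#42=IFCDUCTSEGMENT('2O2Fr$t4X7Zf8NOew3FLOH',#5,'Duct-01',$,$,#40,#41,$);"

# "#<id>=<TYPE>(<attributes>);" -- a simplification that assumes the
# whole instance sits on one line.
INSTANCE_RE = re.compile(r"^#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\);\s*$")


def split_instance(line):
    """Return (id, type, raw_attributes) for one STEP instance line."""
    match = INSTANCE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a STEP instance line: {line!r}")
    return int(match.group(1)), match.group(2), match.group(3)


entity_id, entity_type, raw_attrs = split_instance(SAMPLE)
print(entity_id, entity_type)
```

In practice you would not parse these lines yourself; the point is only to show that the ID, the type name, and the attribute list are all visible in the raw text.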

The entity types form a deep hierarchy. A duct segment, for example, is a specific type of flow segment, which is a type of distribution element, which is a type of building product. This hierarchy matters because it determines what properties and relationships each element can have.

Every element connects to other elements through relationships. A duct segment belongs to a distribution system (via a group assignment relationship). A room is contained in a building storey (via a spatial containment relationship). A wall has a set of properties (via a property set relationship). Understanding these relationships is the key to extracting useful data from IFC.

The Tool: IfcOpenShell

IfcOpenShell is the de facto standard open-source library for working with IFC files. It provides Python and C++ APIs for reading, writing, and manipulating IFC data. It handles the STEP parsing, entity resolution, and type hierarchy, so you can focus on extracting the data you need.

With IfcOpenShell, you can:

  • Open an IFC file and inspect its schema version (IFC2x3 or IFC4) and total entity count
  • Query elements by type — get all walls, all duct segments, all pipe segments, all spaces
  • Read properties — access both standard property sets (defined by the IFC specification) and custom property sets (added by the authoring tool or firm)
  • Traverse relationships — find which system a duct belongs to, which storey a room is on, which building contains a storey
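The first two capabilities in the list above can be sketched with `ifcopenshell.open`, the file's `schema` attribute, and `by_type`. The helper names `open_model` and `summarize` are hypothetical, and "model.ifc" is a placeholder path. The import sits inside `open_model` only so that `summarize` can be tried against a stand-in object without IfcOpenShell installed.

```python
def open_model(path):
    """Open an IFC file with IfcOpenShell (imported lazily)."""
    import ifcopenshell

    return ifcopenshell.open(path)


def summarize(model, type_names=("IfcWall", "IfcDuctSegment", "IfcPipeSegment", "IfcSpace")):
    """Report the schema, total entity count, and a per-type element count."""
    return {
        "schema": model.schema,
        # Iterating the file object visits every entity instance in it.
        "entity_count": sum(1 for _ in model),
        # by_type also returns instances of subtypes of the named type.
        "counts": {name: len(model.by_type(name)) for name in type_names},
    }


# Usage, assuming a file at this placeholder path exists:
# print(summarize(open_model("model.ifc")))
```

Note that `by_type` raises an error for a type name the file's schema does not define, so the default type list here assumes an IFC4 model.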

Extracting MEP System Data

For HVAC and MEP applications, the most valuable data in an IFC file is the flow system data — ducts, pipes, fittings, terminals, and the systems they belong to.

Here's the extraction approach:

Step 1: Collect Duct and Pipe Segments

Query the model for all duct segments and pipe segments. For each one, extract:

  • Identity: Global ID and element name
  • System membership: Which distribution system does this element belong to? (e.g., "Supply Air System 1" or "Domestic Hot Water"). This requires traversing the group assignment relationships.
  • Dimensions: Nominal diameter (for round ducts/pipes), width and height (for rectangular ducts), and length. These come from standard property sets like "Pset_DuctSegmentTypeCommon" and quantity takeoff sets like "Qto_DuctSegmentBaseQuantities."
  • Material: The material specification, if available.
  • Location: The building storey that contains the element, and its 3D coordinates.
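A minimal sketch of the identity and system-membership part of this step, assuming an IFC4 model opened with IfcOpenShell (IFC2x3 has no IfcDuctSegment or IfcPipeSegment occurrence entities, so the type names would need adjusting there). `systems_of` and `collect_segments` are hypothetical helpers, and the sketch assumes systems are assigned directly to the elements, which not every export does.

```python
def systems_of(element):
    """Names of the groups (systems included) this element is assigned to."""
    names = []
    # HasAssignments is the inverse attribute listing IfcRelAssigns* relations.
    for rel in getattr(element, "HasAssignments", None) or ():
        if rel.is_a("IfcRelAssignsToGroup"):
            names.append(rel.RelatingGroup.Name)
    return names


def collect_segments(model):
    """Return one plain dict per duct or pipe segment in the model."""
    records = []
    for type_name in ("IfcDuctSegment", "IfcPipeSegment"):
        for element in model.by_type(type_name):
            records.append({
                "global_id": element.GlobalId,
                "name": element.Name,
                "type": element.is_a(),  # concrete entity type name
                "systems": systems_of(element),
            })
    return records
```

Dimensions, material, and location are left out here on purpose; they come from property sets and the spatial structure, which the next two steps cover.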

Step 2: Handle Property Sets

IFC organizes element data into property sets (Psets). There are two kinds:

  • Standard sets (names starting with Pset_ for property sets or Qto_ for quantity sets): Defined by the IFC specification, so the names and property definitions are the same across authoring tools, although which sets an export actually populates can vary. Examples: Pset_DuctSegmentTypeCommon contains nominal diameter and shape; Qto_DuctSegmentBaseQuantities contains length and surface area.
  • Custom Psets: Defined by the authoring tool or the firm's BIM standards. This is where you'll find things like insulation thickness, fire rating, system abbreviations, and other project-specific data. Naming conventions vary wildly between firms.
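The split between the two kinds can be sketched by name prefix alone. The input is assumed to be a dictionary mapping set names to property dictionaries, which is the shape `ifcopenshell.util.element.get_psets(element)` returns. `split_psets` is a hypothetical helper, and the custom set name and all values in the example are made up.

```python
STANDARD_PREFIXES = ("Pset_", "Qto_")


def split_psets(psets):
    """Separate specification-defined sets from custom ones by name prefix."""
    standard, custom = {}, {}
    for name, props in psets.items():
        target = standard if name.startswith(STANDARD_PREFIXES) else custom
        target[name] = props
    return standard, custom


# Illustrative input; the custom set name and all values are invented.
example = {
    "Pset_DuctSegmentTypeCommon": {"NominalDiameterOrWidth": 0.25},
    "Qto_DuctSegmentBaseQuantities": {"Length": 3.2},
    "ACME_Insulation": {"Thickness": 0.03},
}
standard, custom = split_psets(example)
print(sorted(standard), sorted(custom))
```

A prefix check is only a heuristic: nothing stops a firm from naming a custom set with a Pset_ prefix, so a stricter version would compare against the list of sets the schema actually defines.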

Step 3: Resolve Spatial Hierarchy

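Every element exists somewhere in the building's spatial structure: Site → Building → Storey → Space. To find which storey a duct segment is on, you traverse the spatial containment relationships. Some elements are directly contained in a storey; others are contained in a space, in which case you have to walk up from the space to its storey. Zones are groupings of spaces rather than containers, so they do not appear on this path.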
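The walk up the spatial structure can be sketched with two inverse attributes: `ContainedInStructure` (element to its container) and `Decomposes` (container to its parent). `spatial_path` and `storey_of` are hypothetical helpers, and the sketch assumes each element has at most one container and each container at most one parent.

```python
def spatial_path(element):
    """List the containers above an element, nearest first."""
    path = []
    # ContainedInStructure is the inverse of IfcRelContainedInSpatialStructure.
    rels = getattr(element, "ContainedInStructure", None) or ()
    current = rels[0].RelatingStructure if rels else None
    while current is not None:
        path.append(current)
        # Decomposes points at the aggregation relation whose RelatingObject
        # is the parent (space -> storey -> building -> site).
        parents = getattr(current, "Decomposes", None) or ()
        current = parents[0].RelatingObject if parents else None
    return path


def storey_of(element):
    """Return the first IfcBuildingStorey above the element, or None."""
    for container in spatial_path(element):
        if container.is_a("IfcBuildingStorey"):
            return container
    return None
```

IfcOpenShell also ships `ifcopenshell.util.element.get_container`, which returns the direct container; the manual walk is shown here because it makes the two relationships explicit.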

Serving IFC Data Through a Web API

Once you can extract data from IFC files, the next step is making it available to web applications. The typical approach:

  1. Upload endpoint — Accept an IFC file via HTTP upload
  2. Parse and extract — Run the extraction logic server-side (Python with IfcOpenShell)
  3. Return structured JSON — Send back the extracted data in a clean, well-structured format

A useful API response for MEP data might include:

  • Summary level: Total segment count, number of ducts vs. pipes, list of discovered systems
  • System level: For each system — name, element count, total length, element types
  • Element level: For each segment — ID, name, type, system, dimensions, material, level, coordinates
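One possible shape for such a response, sketched in plain Python. The field names, the `build_response` helper, and the sample records are all assumptions made for illustration; the records stand in for whatever the extraction step produces.

```python
import json
from collections import defaultdict


def build_response(records):
    """Group flat segment records into summary, system, and element levels."""
    systems = defaultdict(
        lambda: {"element_count": 0, "total_length": 0.0, "element_types": set()}
    )
    for rec in records:
        entry = systems[rec.get("system") or "Unassigned"]
        entry["element_count"] += 1
        entry["total_length"] += rec.get("length") or 0.0
        entry["element_types"].add(rec["type"])
    return {
        "summary": {
            "segment_count": len(records),
            "duct_count": sum(r["type"] == "IfcDuctSegment" for r in records),
            "pipe_count": sum(r["type"] == "IfcPipeSegment" for r in records),
            "systems": sorted(systems),
        },
        "systems": [
            {
                "name": name,
                "element_count": s["element_count"],
                "total_length": s["total_length"],
                # Sets are not JSON-serializable, so emit a sorted list.
                "element_types": sorted(s["element_types"]),
            }
            for name, s in sorted(systems.items())
        ],
        "elements": records,
    }


# Invented sample records for illustration.
sample = [
    {"global_id": "a1", "name": "Duct-01", "type": "IfcDuctSegment",
     "system": "Supply Air System 1", "length": 3.0},
    {"global_id": "a2", "name": "Duct-02", "type": "IfcDuctSegment",
     "system": "Supply Air System 1", "length": 2.0},
    {"global_id": "b1", "name": "Pipe-01", "type": "IfcPipeSegment",
     "system": "Domestic Hot Water", "length": 4.5},
]
response = build_response(sample)
print(json.dumps(response["summary"], indent=2))
```

Keeping the element list flat and the system level pre-aggregated lets a front end render an overview without walking every element.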

For production use, it makes sense to parse the IFC file once and store the extracted data in a database (PostgreSQL works well). Subsequent queries hit the database rather than re-parsing the file, which matters when files can be 500MB+ for a full building model.
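The parse-once pattern can be sketched with Python's built-in `sqlite3` module, used here purely as a stand-in for PostgreSQL so the example runs without a database server. The table layout and the helpers `store_segments` and `length_by_system` are assumptions for illustration.

```python
import sqlite3


def store_segments(conn, records):
    """Persist extracted segment records so later queries skip the IFC file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS segments ("
        "global_id TEXT PRIMARY KEY, name TEXT, type TEXT, system TEXT, length REAL)"
    )
    # INSERT OR REPLACE keeps a re-upload of the same model from duplicating rows.
    conn.executemany(
        "INSERT OR REPLACE INTO segments "
        "VALUES (:global_id, :name, :type, :system, :length)",
        records,
    )
    conn.commit()


def length_by_system(conn):
    """Total segment length per system, answered from the database alone."""
    rows = conn.execute(
        "SELECT system, SUM(length) FROM segments GROUP BY system ORDER BY system"
    )
    return dict(rows.fetchall())
```

With PostgreSQL the same idea applies; only the driver and the upsert syntax change.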

IFC4 vs IFC2x3: Differences That Matter

You'll encounter two major IFC versions in practice:

Feature            | IFC2x3                  | IFC4
MEP systems        | Generic system entities | Specific distribution system entities
Ports/connections  | Basic port definitions  | Better defined, more reliable
Property templates | Basic                   | Reusable property template definitions
Geometry           | Mostly solid geometry   | Adds tessellated geometry, better curves
File size          | Typically smaller       | Larger due to richer data

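The practical impact: when extracting system data, you need to handle both versions. IFC4 adds a dedicated entity for distribution systems (IfcDistributionSystem), while IFC2x3 only has the generic IfcSystem. The segment entities differ too: IfcDuctSegment and IfcPipeSegment are IFC4 occurrence types, whereas IFC2x3 models carry ducts and pipes as IfcFlowSegment occurrences typed by IfcDuctSegmentType or IfcPipeSegmentType. Your extraction logic should check for both.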
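A minimal sketch of version-aware system discovery, assuming a model opened with IfcOpenShell. It relies on IfcDistributionSystem being a subtype of IfcSystem in IFC4, so a single `by_type("IfcSystem")` query covers both versions. `find_systems` and the "distribution"/"generic" labels are hypothetical.

```python
def find_systems(model):
    """Return (system, kind) pairs for both IFC2x3 and IFC4 models."""
    # IfcDistributionSystem only exists from IFC4 onward, so skip the
    # subtype check entirely on IFC2x3 files.
    has_distribution = not model.schema.upper().startswith("IFC2X3")
    results = []
    # by_type includes subtypes, so this also returns distribution systems.
    for system in model.by_type("IfcSystem"):
        if has_distribution and system.is_a("IfcDistributionSystem"):
            kind = "distribution"
        else:
            kind = "generic"
        results.append((system, kind))
    return results
```

IfcSystem has other subtypes as well, so the "generic" bucket may contain entries that are not MEP systems and usually needs a further filter.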

Practical Use Cases

Once you can parse IFC files, the applications are broad:

  • Automated QA — Validate that every duct segment has an insulation thickness specified, every pipe has a fire rating, every space has a design temperature. Catch missing data before model submission.
  • Quantity takeoff — Sum up total duct length by diameter, pipe length by material, number of fittings by type. Feed this directly into cost estimation.
  • Clash pre-check — Before running expensive clash detection in Navisworks, do a bounding-box pre-check to identify areas with high element density.
  • Data validation — Cross-reference MEP elements against an Information Delivery Specification (IDS) to check compliance before submission.
  • Calculation input — Extract room volumes, envelope areas, and thermal properties to feed into HVAC calculation tools like Mepbau. This is one of the most exciting applications — turning a BIM model into automatic engineering results.
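Two of these use cases, automated QA and quantity takeoff, reduce to a few lines once the data is in plain records. The record layout, the helper names `missing_property` and `length_by_diameter`, and the custom set name used in the usage note are assumptions for illustration.

```python
from collections import defaultdict


def missing_property(records, pset_name, prop_name):
    """GlobalIds of records with no value for pset_name.prop_name."""
    return [
        rec["global_id"]
        for rec in records
        if rec.get("psets", {}).get(pset_name, {}).get(prop_name) is None
    ]


def length_by_diameter(records):
    """Sum segment length per nominal diameter (None groups unknown sizes)."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec.get("diameter")] += rec.get("length") or 0.0
    return dict(totals)
```

For example, `missing_property(records, "ACME_Insulation", "Thickness")` would list every segment that lacks an insulation thickness, where "ACME_Insulation" stands for whatever custom set a given firm uses.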

Performance Tips for Large Models

IFC files can be massive — 500MB+ for a full building model with MEP systems. A few strategies:

  1. Filter early. Only load the entity types you need. If you only care about ducts and pipes, don't iterate over walls, doors, and furniture.
  2. Cache property lookups. Traversing relationships to extract property sets is expensive when done for thousands of elements. Extract properties once and cache the results.
  3. Parse once, query many. For production applications, parse the IFC file once on upload and store the extracted data in a database. All subsequent operations query the database.
  4. Stream large files. For very large models, use iterator-based parsing rather than loading the entire file into memory at once.
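Tip 2 can be sketched as a small cache keyed by GlobalId. `PsetCache` is a hypothetical class; the loader is any function that takes an element and returns its property sets, for example `ifcopenshell.util.element.get_psets`.

```python
class PsetCache:
    """Remember property-set lookups so each element is resolved only once."""

    def __init__(self, loader):
        self._loader = loader  # function: element -> dict of property sets
        self._cache = {}
        self.misses = 0  # number of times the loader actually ran

    def get(self, element):
        key = element.GlobalId
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._loader(element)
        return self._cache[key]


# Usage, assuming IfcOpenShell is installed and `model` is an opened file:
# import ifcopenshell.util.element
# cache = PsetCache(ifcopenshell.util.element.get_psets)
# psets = cache.get(model.by_type("IfcDuctSegment")[0])
```

Keying on GlobalId rather than the element object keeps the cache valid even if the same element is fetched again through a different query.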

What's Next

IFC parsing is the foundation for a whole class of web-based BIM tools. The ability to read building data from a universal format, extract the parts you need, and serve them through a web API opens up possibilities that desktop-only tools can't match.

In the next post, I'll show how to take MEP data extracted from IFC and feed it into an HVAC calculation engine — turning a Revit model into automatic heating load results.


Building web tools around IFC data? I'm always looking to connect with people in the open BIM space. Reach out at hello@laborsam.com.