STEP file (p21)

The Clear Text Encoding of the Exchange Structure as defined in Part 21 (ISO 10303-21), also called STEP Physical File, is the most widely used exchange format throughout STEP. Due to its ASCII structure it is easy to read, with typically one instance per line.

Some details to take care of:

  • The first edition of ISO 10303-21:1994 had some bugs and was fixed by a Technical Corrigendum.
  • The second edition is ISO 10303-21:2002, including all the fixes and extensions for several data sections.
  • Part 21 defines two conformance classes. They differ only in the way complex entity instances are encoded.
    • Conformance class 1, which is always used in practice, enforces the so-called internal mapping, which is more compact.
    • Conformance class 2, which is not used in practice, always enforces the external mapping. In theory, this would allow better AP interoperability since a postprocessor may know how to handle some supertypes, but may not know the specified subtypes.
  • The 1st edition of Part 21 enforces the use of so-called SHORT NAMES, which is optional in the 2nd edition. In practice SHORT NAMES are currently not used.
  • The 2nd edition allows for several data sections. In practice only one data section is used (1st edition encoding).
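
The difference between the two mappings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names internal_mapping and external_mapping are hypothetical, attribute values are passed in already encoded as Part 21 text, and the shape/circle entities are invented for the example. The SI unit instance follows the form commonly seen in AP files.

```python
def internal_mapping(instance_id, partial_entities):
    """Encode a complex instance that has exactly one leaf entity.

    partial_entities is a list of (entity_name, [encoded attribute values])
    ordered from the root supertype down to the leaf entity.
    """
    leaf_name = partial_entities[-1][0]
    # Inherited attributes come first, followed by those of the subtypes.
    values = [value for _, attrs in partial_entities for value in attrs]
    return f"#{instance_id}={leaf_name.upper()}({','.join(values)});"


def external_mapping(instance_id, partial_entities):
    """Encode every partial entity separately, sorted alphabetically by name."""
    parts = sorted(partial_entities, key=lambda part: part[0].upper())
    body = "".join(f"{name.upper()}({','.join(attrs)})" for name, attrs in parts)
    return f"#{instance_id}=({body});"


# Hypothetical two-level hierarchy: circle is a subtype of shape.
circle = [("shape", ["'s1'"]), ("circle", ["2.5"])]
print(internal_mapping(1, circle))

# An instance with more than one leaf entity needs the external mapping.
unit = [("named_unit", ["*"]), ("length_unit", []), ("si_unit", [".MILLI.", ".METRE."])]
print(external_mapping(30, unit))
```

Note that the external form keeps each partial entity recognizable by name, which is what would let a postprocessor handle the supertypes it knows while skipping unknown subtypes.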


A typical example looks like this:

ISO-10303-21;
HEADER;
FILE_DESCRIPTION(
/* description */ (''),
/* implementation_level */ '2;1');
FILE_NAME(
/* name */ ' ',
/* time_stamp */ '2003-12-27T11:57:53',
/* author */ (' '),
/* organization */ ('LKSoft'),
/* preprocessor_version */ ' ',
/* originating_system */ 'IDA-STEP 1.1.alpha',
/* authorization */ ' ');
FILE_SCHEMA (('AUTOMOTIVE_DESIGN { 1 0 10303 214 2 1 1}'));
ENDSEC;
DATA;
#11=PRODUCT_DEFINITION_CONTEXT('part definition',#12,'manufacturing');
#12=APPLICATION_CONTEXT('mechanical design');
#16=PRODUCT('A0001','Test Part 1','',(#18));
#20=ORGANIZATION_ROLE('id owner');
ENDSEC;
END-ISO-10303-21;
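
As a rough illustration of the overall file layout, the following Python sketch splits an exchange structure into its HEADER part and its DATA part(s). The function name split_sections is hypothetical, and the approach is deliberately naive: it assumes that comment delimiters and the section keywords never occur inside string values, which a real Part 21 parser must not assume.

```python
import re


def split_sections(text):
    """Split a Part 21 exchange structure into (header_text, [data_text, ...])."""
    # Naive comment removal: assumes "/*" never occurs inside a string value.
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    start = text.index("ISO-10303-21;") + len("ISO-10303-21;")
    end = text.index("END-ISO-10303-21;", start)
    body = text[start:end]
    header = re.search(r"\bHEADER\s*;(.*?)\bENDSEC\s*;", body, flags=re.S)
    if header is None:
        raise ValueError("no HEADER section found")
    rest = body[header.end():]
    # A DATA keyword may carry parameters in 2nd edition files, hence the
    # optional parenthesized part before the semicolon.
    data = re.findall(r"\bDATA\b\s*(?:\(.*?\))?\s*;(.*?)\bENDSEC\s*;", rest, flags=re.S)
    return header.group(1).strip(), [section.strip() for section in data]


SAMPLE = """ISO-10303-21;
HEADER;
FILE_DESCRIPTION(/* description */ (''), /* implementation_level */ '2;1');
FILE_NAME(' ', '2003-12-27T11:57:53', (' '), ('LKSoft'), ' ', 'IDA-STEP 1.1.alpha', ' ');
FILE_SCHEMA(('AUTOMOTIVE_DESIGN { 1 0 10303 214 2 1 1}'));
ENDSEC;
DATA;
#12=APPLICATION_CONTEXT('mechanical design');
#20=ORGANIZATION_ROLE('id owner');
ENDSEC;
END-ISO-10303-21;
"""

header, data_sections = split_sections(SAMPLE)
print(header)
print(data_sections)
```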

HEADER section

As the example above shows, the file is split into two sections following the initial keyword "ISO-10303-21;": the HEADER section and the DATA section.

The HEADER section has a fixed structure consisting of 3 to 6 groups in the given order. Except for the data fields time_stamp and FILE_SCHEMA, all fields may contain empty strings.

  • FILE_DESCRIPTION
    • description
    • implementation_level - the version and conformance option of this file. Possible versions are 1 for the original standard of 1994, 2 for the technical corrigendum in 1995 and 3 for the second edition. The conformance option is either 1 for internal or 2 for external mapping of complex entity instances. Most often you will find the value '2;1' here. The value '2;2', enforcing external mapping, is also possible but very rarely used. The values '3;1' and '3;2' indicate extended STEP-files as defined in the second edition (ISO 10303-21:2002), with several DATA sections, multiple schemas and FILE_POPULATION support.
  • FILE_NAME
    • name of this exchange structure. It may correspond to the name of the file in a file system or reflect the data in this file. There is no strict rule on how to use this field.
    • time_stamp - the time when this file was created. The time is given in the international date and time format ISO 8601, e.g. 2003-12-27T11:57:53 for 27 December 2003, about two minutes before noon.
    • author - the name and mailing address of the person creating this exchange structure.
    • organization - the organization to which the person belongs.
    • preprocessor_version - the name and version of the system which produced this STEP-file.
    • originating_system - the name and version of the system which originally created the information contained in this STEP-file.
    • authorization - the name and mailing address of the person who authorized this file.
  • FILE_SCHEMA. For version 2, only one EXPRESS schema, together with the object identifier of the schema, can be listed here. From version 3 on this is relaxed to allow several entries.

The last 3 header groups are only valid from version 3 on.

  • FILE_POPULATION, indicating a valid population (set of entity instances), which conforms to EXPRESS schemas. This is done by collecting data from several data_sections and in addition referenced instances from other data sections.
    • governing_schema - the EXPRESS schema to which the indicated population belongs and by which it can be validated.
    • determination_method - the method used to figure out which instances belong to the population. Three methods are predefined: SECTION_BOUNDARY, INCLUDE_ALL_COMPATIBLE, and INCLUDE_REFERENCED.
    • governed_sections - the data sections, whose entity instances fully belong to the population.
    • The concept of FILE_POPULATION is very close to schema_instance of SDAI. Unfortunately, during the standardization process it was not possible to achieve an agreement to get these concepts fully synchronized. Therefore JSDAI adds further attributes to FILE_POPULATION as intelligent comments to cover all missing information from schema_instance. This is supported for both, import and export.
  • SECTION_LANGUAGE allows assigning a default language either to all data sections or to a specific one. This is needed for those EXPRESS schemas which do not provide the capability to specify in which language string attributes of entities, such as name and description, are given.
  • SECTION_CONTEXT provides the capability of specifying additional context information for all or single data sections. This can be used e.g. for STEP-APs to indicate which conformance class is covered by a particular data section.
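
The two header fields with a fixed inner format, implementation_level and time_stamp, can be interpreted with a few lines of Python. This is a minimal sketch; the function names are hypothetical and the accepted values are taken from the description above, not from a reading of the standard's grammar.

```python
from datetime import datetime

# Meaning of the two numbers in implementation_level, as described above.
VERSIONS = {1: "first edition", 2: "technical corrigendum", 3: "second edition"}
MAPPINGS = {1: "internal", 2: "external"}


def parse_implementation_level(value):
    """Turn a value such as '2;1' into (version, conformance_option)."""
    version_text, option_text = value.split(";")
    version, option = int(version_text), int(option_text)
    if version not in VERSIONS or option not in MAPPINGS:
        raise ValueError(f"unexpected implementation_level: {value!r}")
    return version, option


def parse_time_stamp(value):
    """Parse an ISO 8601 time stamp as found in FILE_NAME."""
    return datetime.fromisoformat(value)


version, option = parse_implementation_level("2;1")
print(VERSIONS[version], "with", MAPPINGS[option], "mapping")
print(parse_time_stamp("2003-12-27T11:57:53"))
```

A reader could use the version number to decide whether FILE_POPULATION, SECTION_LANGUAGE and SECTION_CONTEXT may appear at all, since these groups are only valid from version 3 on.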

DATA section

The DATA section contains application data according to one specific Express schema. The encoding of this data follows some simple principles:

  • Instance name: Every entity instance in the exchange structure is given a unique name in the form "#1234". The instance name consists of "#" followed by a positive number (>0), which is typically less than 2**63. The instance name is only valid locally within the STEP-file. If the same content is exported again from a system, the instance names may be different for the same instances. The instance name is also used to reference other entity instances through attribute values or aggregate members. The referenced instance may be defined before or after the current instance.
  • Instances of single entity data types are represented by writing the name of the entity in capital letters, followed by the attribute values in the defined order within parentheses. See e.g. "#16=PRODUCT(...)" above.
  • Instances of complex entity data types are represented in the STEP file by using either the internal mapping or the external mapping.
    • External mapping has to be used if the complex entity instance consists of more than one leaf entity. In this case the values of each single entity are given independently of each other, each encoded as defined above, in alphabetical order of the entity names and with all of them grouped together in one pair of parentheses.
    • Internal mapping is used by default for conformance option 1 when the complex entity instance consists of only one leaf entity. The encoding is similar to that of a single entity instance, with the order of the attributes following the supertype/subtype definition (inherited attributes first).
  • Mapping of attribute values:
    • Only explicit attributes get mapped. Inverse, Derived and redeclared attributes are not listed since their values can be deduced from the other ones.
    • Unset attribute values are given as "$".
    • Explicit attributes which got redeclared as derived in a subtype are encoded as "*" in the position of the supertype attribute.
  • Mapping of other data types:
    • Enumeration, boolean and logical values are given in capital letters with a leading and trailing dot, such as .T. and .F. for the boolean values true and false.
    • String values are enclosed in apostrophes (' '). For characters with a code greater than 126, a special encoding is used. The character sets as defined in ISO 8859 and ISO 10646 are supported. Note that typical 8-bit (e.g. West European) or 16-bit (Unicode) character sets cannot be used directly in STEP-file strings. They have to be encoded in a very special way.
    • Integer and real values are written as in typical programming languages.
    • The elements of aggregates (SET, BAG, LIST, ARRAY) are given in parentheses, separated by ','.
    • Care has to be taken for select data types based on defined data types. Here the name of the defined data type is mapped too.
  • See also "Mapping of Express to Java" for more details of this.
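
The encoding rules above can be illustrated with a small Python reader for one simple entity instance. This is a sketch under stated assumptions, not a complete Part 21 parser: all names (Ref, Enum, Typed, DERIVED, parse_instance) are hypothetical, externally mapped complex instances and the special string encoding for characters above code 126 are not handled, and malformed input is only partly detected. SAMPLE_ENTITY in the second call is an invented entity name.

```python
import re
from collections import namedtuple

Ref = namedtuple("Ref", "id")                     # reference such as #18
Enum = namedtuple("Enum", "name")                 # dotted value such as .T.
Typed = namedtuple("Typed", "type_name value")    # value with a defined type name
DERIVED = object()                                # marker for "*"

TOKEN = re.compile(r"""\s*(?:
      (?P<ref>\#\d+)
    | (?P<string>'(?:[^']|'')*')
    | (?P<enum>\.[A-Z_][A-Z0-9_]*\.)
    | (?P<real>[+-]?\d+\.\d*(?:E[+-]?\d+)?)
    | (?P<int>[+-]?\d+)
    | (?P<name>[A-Z_][A-Z0-9_]*)
    | (?P<punct>[()$*,])
    )""", re.X)
INSTANCE = re.compile(r"#(\d+)\s*=\s*([A-Z_][A-Z0-9_]*)\s*(\(.*\))\s*;", re.S)


def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_value(tokens, i):
    """Return (value, index of the next unread token)."""
    kind, text = tokens[i]
    if (kind, text) == ("punct", "("):            # aggregate or attribute list
        items, i = [], i + 1
        while tokens[i] != ("punct", ")"):
            value, i = parse_value(tokens, i)
            items.append(value)
            if tokens[i] == ("punct", ","):
                i += 1
        return items, i + 1
    if (kind, text) == ("punct", "$"):            # unset attribute
        return None, i + 1
    if (kind, text) == ("punct", "*"):            # redeclared as derived
        return DERIVED, i + 1
    if kind == "string":                          # '' stands for one apostrophe
        return text[1:-1].replace("''", "'"), i + 1
    if kind == "int":
        return int(text), i + 1
    if kind == "real":
        return float(text), i + 1
    if kind == "enum":
        return Enum(text[1:-1]), i + 1
    if kind == "ref":
        return Ref(int(text[1:])), i + 1
    if kind == "name" and tokens[i + 1] == ("punct", "("):
        value, j = parse_value(tokens, i + 2)     # e.g. a select of a defined type
        if tokens[j] != ("punct", ")"):
            raise ValueError("typed value must contain exactly one value")
        return Typed(text, value), j + 1
    raise ValueError(f"unexpected token {text!r}")


def parse_instance(line):
    """Parse '#16=PRODUCT(...);' into (instance number, entity name, values)."""
    match = INSTANCE.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a simple entity instance")
    number = int(match.group(1))
    if number <= 0:
        raise ValueError("instance names must be positive")
    values, _ = parse_value(tokenize(match.group(3)), 0)
    return number, match.group(2), values


print(parse_instance("#16=PRODUCT('A0001','Test Part 1','',(#18));"))
print(parse_instance("#5=SAMPLE_ENTITY($,*,.T.,1,2.5,LENGTH_MEASURE(3.0));"))
```

Because references may point forward as well as backward, a real reader would first collect all instances by number and resolve the Ref values in a second pass.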