Network Working Group                                         G. Trewitt
Request for Comments: 1023                                      Stanford
                                                            C. Partridge
                                                                BBN/NNSC
                                                            October 1987

                  HEMS Monitoring and Control Language

   This RFC specifies the design of a general-purpose, yet efficient,
   monitoring and control language for managing network entities. The
   data in the entity is modeled as a hierarchy and specific items are
   named by giving the path from the root of the tree. Most items are
   read-only, but some can be "set" in order to perform control
   operations. Both requests and responses are represented using the
   ISO ASN.1 data encoding rules.

STATUS OF THIS MEMO

   The purpose of this RFC is to provide a specification for monitoring
   and control of network entities in the Internet. This is an
   experimental specification and is intended for use in testing the
   ideas presented here. No proposals in this memo are intended as
   standards for the Internet at this time. After sufficient
   experimentation and discussion, this RFC will be redrafted, perhaps
   as a standard. Distribution of this memo is unlimited.

   This language is a component of the High-Level Entity Monitoring
   System (HEMS) described in RFC-1021 and RFC-1022. Readers may want
   to consult these RFCs when reading this memo. RFC-1024 contains
   detailed assignments of numbers and structures used in this system.
   This memo assumes a knowledge of the ISO data encoding standard,
   ASN.1.

OVERVIEW AND SCOPE

   The basic model of monitoring and control used in this proposal is
   that a query is sent to a monitored entity and the entity sends back
   a response. The term query is used in the database sense -- it may
   request information, modify things, or both. We will use
   gateway-oriented examples, but it should be understood that this
   query-response mechanism can be applied to other entities besides
   just gateways.

   In particular, there is no notion of an interactive "conversation"
   as in SMTP [RFC-821] or FTP [RFC-959]. A query is a complete request
   that stands on its own and elicits a complete response.
Trewitt & Partridge                                             [Page 1]

RFC 1023                     HEMS Language                  October 1987

   It is not necessary for a monitored entity to be able to store the
   complete query. It is quite possible for an implementation to
   process the query on the fly, producing portions of the response
   while the query is still being received.

   Other RFCs associated with HEMS are:

      RFC-1021 -- Overview;
      RFC-1022 -- transport protocol and message encapsulation;
      RFC-1024 -- precise data definitions.

   These issues are not dealt with here. It is assumed that there is
   some mechanism to transport a sequence of octets to a query
   processor within the monitored entity and that there is some
   mechanism to return a sequence of octets to the entity making the
   query.

ENCODING OF QUERIES AND RESPONSES

   Both queries and responses are encoded using the representation
   defined in ISO Standard ASN.1 (Abstract Syntax Notation 1). ASN.1
   represents data as sequences of <tag, length, contents> triples that
   are encoded as a stream of octets. The data tuples may be
   recursively nested to represent structured data such as arrays or
   records. For a full description of this notation, see the ISO
   documents IS 8824 and IS 8825. See the end of this memo for
   information about ordering these documents.

NOTATION USED IN THIS PROPOSAL

   The notation used in this memo is similar to that used in ASN.1, but
   less formal, smaller, and (hopefully) easier to read. The most
   important difference is that, in this memo, we are not concerned
   with the length of the data items.

   ASN.1 data items may be either a "simple type" such as integer or
   octet string or a "structured type", a collection of data items.
   The notation:

      ID(value)

   represents a simple data item whose tag is "ID" with the given
   value. A structured data item is represented as:

      ID { ... contents ... }

   where contents is a sequence of data items. Remember that the
   contents may include both simple and structured types, so the
   structure is fully recursive.
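The correspondence between this notation and the underlying ASN.1 triples can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the specification: the helper names (simple, structured, encode_length) are invented here, the tag numbers are hypothetical (the real assignments are in RFC-1024), and only single-octet tags (numbers below 31) are handled.

```python
# Class bits of the identifier octet in the ASN.1 basic encoding rules.
APPLICATION = 0x40
CONTEXT = 0x80
CONSTRUCTED = 0x20  # set when the contents are themselves data items


def encode_length(n):
    """Short form below 128; otherwise a count octet plus big-endian length."""
    if n < 0x80:
        return bytes([n])
    octets = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(octets)]) + octets


def simple(tag_class, number, value=b""):
    """ID(value): a simple data item (tag numbers below 31 only)."""
    assert 0 <= number < 31
    return bytes([tag_class | number]) + encode_length(len(value)) + value


def structured(tag_class, number, *items):
    """ID{ ... }: a structured item wrapping already-encoded items."""
    assert 0 <= number < 31
    contents = b"".join(items)
    header = bytes([tag_class | CONSTRUCTED | number])
    return header + encode_length(len(contents)) + contents


# Hypothetical tag numbers: application 1 for an outer item,
# context-specific 2 for an inner integer-valued item.
inner = simple(CONTEXT, 2, (1234).to_bytes(2, "big"))
outer = structured(APPLICATION, 1, inner)
print(outer.hex(" "))  # prints "61 04 82 02 04 d2"
```

Because a structured item simply wraps the octets of its members, nesting to any depth needs no extra machinery, which is the recursion the notation relies on.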
   There are situations where it is desirable to specify a type but
   give no value, such as when there is no meaningful value for a
   particular measured parameter or when the entire contents of a
   structured type is being specified. In this situation, the same
   notation is used, but with the value omitted:

      ID()     or     ID{}

   The representation of this is obvious -- the data item has zero for
   the length and no contents.

DATA MODEL

   Data in a monitored entity is modeled as a hierarchy.
   Implementations are not required to organize the data internally as
   a hierarchy, but they must provide this view of the data through the
   query language. A hierarchy offers useful structure for the
   following operations:

   Organization
      A hierarchy allows related data to be grouped together in a
      natural way.

   Naming
      The name of a piece of data is just the path from the root to the
      data of interest.

   Mapping onto ASN.1
      ASN.1 can easily represent a hierarchy by using "constructor"
      types as an envelope for an entire subtree.

   Efficient Representation
      Hierarchical structures are quite compact and can be traversed
      very quickly.

   Each node in the hierarchy must have names for its component parts.
   Although we would normally think of names as being ASCII strings
   such as "input errors", the actual name would just be an ASN.1 tag.
   Such names would be small integers (typically, less than 100) and so
   could easily be mapped by the monitored entity onto its internal
   representation.

   We will use the term "dictionary" to represent an internal node in
   the hierarchy. Here is a possible organization of the hierarchy in
   an entity that has several network interfaces and multiple
   processes. The exact organization of data in entities is specified
   in RFC-1024.
      system {
              name                      -- host name
              clock-msec                -- msec since boot
              interfaces                -- # of interfaces
              }
      interfaces {                      -- one per interface
              interface { type, ip-addr, in-pkts, out-pkts, . . . }
              interface { type, ip-addr, in-pkts, out-pkts, . . . }
              interface { type, ip-addr, in-pkts, out-pkts, . . . }
                      :
              }
      processes {
              process { name, stack, interrupts, . . . }
              process { name, stack, interrupts, . . . }
                      :
              }
      route-table {
              route-entry { dest, interface, nexthop, cost, . . . }
              route-entry { dest, interface, nexthop, cost, . . . }
                      :
              }
      arp-table {
              arp-entry { hard-addr, ip-addr, age }
              arp-entry { hard-addr, ip-addr, age }
                      :
              }
      memory { }

   The "name" of the clock in this entity would be:

      system{ clock-msec }

   and the name of a route-entry's IP address would be:

      route-table{ route-entry{ ip-addr } }.

   Actually, this is the name of the IP addresses of ALL of the routing
   table entries. This ambiguity is a problem in any situation where
   there are several instances of an item being monitored. If there
   was a meaningful index for such tabular data (e.g., "routing table
   entry #1"), there would be no problem. Unfortunately, there usually
   isn't such an index. The solution to this problem requires that the
   data be accessed on the basis of some of its content. More on this
   later.

   More than one piece of data can be named by a single ASN.1 object.
   The entire collection of system information is named by:

      system{ }

   and the name of a routing table's IP address and cost would be:

      route-table{ route-entry{ ip-addr, cost } }.

Arrays

   There is one sub-type of a dictionary that is used as the basis for
   tables of objects with identical types. We call these dictionaries
   arrays. In the example above, the dictionaries for interfaces,
   processes, routing tables, and ARP tables are all arrays. In fact,
   we expect that most of the interesting data in an entity will be
   contained in arrays.
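The path-based naming shown earlier in this section, including the way one name covers every entry of an array, can be sketched in Python. This is an illustration only, under several assumptions: a dictionary is modeled as a list of (tag, value) pairs so that an array can repeat the same tag, the function name select is invented here, tags are written as strings rather than ASN.1 tags, and all data values are made up.

```python
def select(node, template):
    """Return the parts of `node` named by `template`.

    `template` maps a tag to a sub-template; an empty template names
    the whole node, as in "system{ }".
    """
    if not template:
        return node
    picked = []
    for tag, value in node:
        if tag in template:
            picked.append((tag, select(value, template[tag])))
    return picked


# A small entity following the example hierarchy (values are invented).
entity = [
    ("system", [("name", "gw1"), ("clock-msec", 1234), ("interfaces", 2)]),
    ("route-table", [
        ("route-entry", [("dest", "10.0.0.0"), ("interface", 1), ("cost", 1)]),
        ("route-entry", [("dest", "36.0.0.0"), ("interface", 2), ("cost", 3)]),
    ]),
]

# system{ clock-msec } names exactly one item.
clock = select(entity, {"system": {"clock-msec": {}}})

# route-table{ route-entry{ dest, cost } } names the fields of ALL entries.
routes = select(entity, {"route-table": {"route-entry": {"dest": {}, "cost": {}}}})
print(clock)
print(routes)
```

The second call returns both route entries because the name carries no index, which is the ambiguity the text describes.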
   The primary difference between arrays and plain dictionaries is that
   arrays may contain only one type of item, while dictionaries, in
   general, will contain many different types of items. Arrays are
   usually accessed associatively using special operators in the
   language.

   The fact that these objects are viewed externally as arrays does not
   mean that they are represented in an implementation as linear lists
   of objects. Any collection of same-typed objects is viewed as an
   array, even though it might be represented as, for example, a hash
   table.

REPRESENTATION OF A REPLY

   The data returned to the monitoring entity is a sequence of ASN.1
   data items. Each of these corresponds to one of the top-level
   dictionaries maintained by the monitored entity. The tags for these
   data items will be in the "application-specific" class (e.g., if an
   entity has the above structure for its data, then the only top-level
   data items that will be returned will have tags corresponding to
   these groups). If a query returned data from two of these, the
   representation might look like:

      interfaces{ . . . }
      route-table{ . . . }

   which is just a stream of two ASN.1 objects (each of which may
   consist of many sub-objects).

   Data not in the root dictionary will have tags from the
   context-specific class. Therefore, data must always be fully
   qualified. For example, the name of the entity would always be
   returned encapsulated inside an ASN.1 object for "system". If it
   were not, there would be no way to tell if the object that was
   returned were "name" inside the "system" dictionary or "dest" inside
   the "interfaces" dictionary (assuming in this case that "name" and
   "dest" were assigned the same ASN.1 tag).

   Having fully-qualified data simplifies decoding of the data at the
   receiving end and allows the tags to be locally chosen (e.g.,
   definitions for tags dealing with ARP tables can't conflict with
   definitions for tags dealing with interfaces).
   Therefore, the people doing the name assignments are less
   constrained. In addition, most of the identifiers will be fairly
   small integers.

   It will often be the case that requested data may not be available,
   either because the request was badly formed (asked for data that
   couldn't exist) or because the particular data item wasn't defined
   in a particular situation (time since last error, when there hasn't
   been an error). In this situation, the returned data item will have
   the same tag as in the request, but will have zero-length data.
   Therefore, there can NEVER be an "undefined data" error.

   This allows completely generic queries to be composed without regard
   to whether the data is defined at all of the entities that will
   receive the request. All of the available data will be returned,
   without generating errors that might otherwise terminate the
   processing of the query.

REPRESENTATION OF A REQUEST

   A request to a monitored entity is also a sequence of ASN.1 data
   items. Each item will fit into one of the following categories:

   Template
      These are objects with the same types as the objects returned by
      a request. The difference is that a template only specifies the
      shape of the data -- there are no values contained in it.
      Templates are used to select specific data to be returned. No
      ordering of returned data is implied by the ordering in a
      template. A template may be either simple or structured,
      depending upon what data it is naming. The representations of
      the simple data items in a template all have a length of zero.

   Tag
      A tag is a special case of a template that is a simple
      (non-structured) type (i.e., it names exactly one node in the
      dictionary tree).

   Opcodes
      These objects tell the query interpreter to do something. They
      are described in detail later in this report. Opcodes are
      represented as an application-specific type whose value
      determines the operation. These values are defined in RFC-1024.
   Data
      These are the same objects that are used to represent information
      returned from an entity. It will occasionally be necessary to
      send data as part of a request. For example, when requesting
      information about the interface with IP address "10.0.0.51", the
      address would be sent in the same format in the request as it
      would be seen in a reply.

   Data, Tags, and Templates are usually in the context-specific class,
   except for items in the root dictionary and a few special cases,
   which are in the application-specific class.

QUERY LANGUAGE

   Although queries are formed in a flexible way using what we term a
   "language", this is not a programming language. There are
   operations that operate on data, but most other features of
   programming languages are not present. In particular:

      - Programs are not stored in the query processor.

      - The only form of temporary storage is a stack.

   In the current version of the query language:

      - There are no subroutines.

      - There are no control structures defined in the language.

      - There are no arithmetic or conditional operators.

   These features could be added to the language if needed.

   This language is designed with the goal of being expressive enough
   to write useful queries with, but to guarantee simplicity, both of
   query execution and language implementation.

   The central element of the language is the stack. It may contain
   templates (and therefore tags), data, or dictionaries (and therefore
   arrays) from the entity being monitored. Initially, it contains one
   item, the root dictionary.

   The overall operation consists of reading ASN.1 objects from the
   input stream. All objects that aren't opcodes are pushed onto the
   stack as soon as they are read. Each opcode is executed immediately
   and may remove things from the stack and may generate ASN.1 objects
   and send them to the output stream. Note that portions of the
   response may be generated while the query is still being received.
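The processing loop just described can be sketched in Python. This is an illustrative sketch under stated assumptions, not the specified implementation: it borrows simplified forms of the GET, BEGIN, and END operators from this memo's operator list, writes output in the memo's textual notation instead of ASN.1 octets, models dictionaries as plain Python dicts, omits the template-less form of GET and all error handling, and uses invented names (Op, fill, render, run) and invented data, including the "last-error" item.

```python
class Op:
    """An opcode in the query stream (stand-in for the ASN.1 opcode object)."""

    def __init__(self, name):
        self.name = name


def fill(node, template):
    """Copy the items named by `template`; None marks a zero-length item."""
    out = {}
    for tag, sub in template.items():
        if not isinstance(node, dict) or tag not in node:
            out[tag] = None            # requested data is not available
        elif sub:
            out[tag] = fill(node[tag], sub)
        else:
            out[tag] = node[tag]
    return out


def render(tag, value):
    """Format one item in the memo's notation: ID(value), ID() or ID{ ... }."""
    if value is None:
        return f"{tag}()"
    if isinstance(value, dict):
        inner = " ".join(render(t, v) for t, v in value.items())
        return f"{tag}{{ {inner} }}"
    return f"{tag}({value})"


def run(query, root):
    stack = [root]                     # initially only the root dictionary
    output = []
    for obj in query:
        if not isinstance(obj, Op):
            stack.append(obj)          # non-opcodes are pushed as read
        elif obj.name == "GET":        # dict template GET -> dict
            template = stack.pop()
            for tag, value in fill(stack[-1], template).items():
                output.append(render(tag, value))
        elif obj.name == "BEGIN":      # dict1 tag BEGIN -> dict1 dict
            tag = stack.pop()
            stack.append(stack[-1][tag])
            output.append(f"{tag}{{")
        elif obj.name == "END":        # dict END -> (nothing)
            stack.pop()
            output.append("}")
        else:
            raise ValueError(f"unknown opcode {obj.name}")
    return output


root = {"system": {"name": "gw1", "clock-msec": 1234}}
query = ["system", Op("BEGIN"), {"name": {}, "last-error": {}}, Op("GET"), Op("END")]
print(" ".join(run(query, root)))  # prints "system{ name(gw1) last-error() }"
```

Each opcode appends to the output as soon as it runs, so nothing in the loop requires the whole query to be held before the response starts.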
   The following opcodes are defined in the language. This is a
   provisional list -- changes may need to be made to deal with
   additional needs.

   In the descriptions below, opcode names are in capital letters,
   preceded by the arguments used from the stack and followed by
   results left on the stack. For example:

      OP           a b   OP   t

   means that the OP operator takes <a> and <b> off of the stack and
   leaves <t> on the stack. Many of the operators below leave the
   first operand (<a> in this example) on the stack for future use.

   Here are the operators defined in the query language:

   GET          dict template   GET   dict
      Emit an ASN.1 object with the same "shape" as the given template.
      Any items in the template that are not in <dict> (or its
      components) are represented as objects with a length of zero.
      This handles requests for data that isn't available, either
      because it isn't defined or because it doesn't apply in this
      situation.

   or           dict   GET   dict
      If there is no template, get all of the items in the dictionary.
      This is equivalent to providing a template that lists all of the
      items in the dictionary.

   BEGIN        dict1 tag   BEGIN   dict1 dict
      Pushes the value for dict{ tag } on the stack, which should be
      another dictionary. At the same time, produce the beginning
      octets of an ASN.1 object corresponding to that dictionary. It
      is up to the implementation to choose between using the
      "indefinite length" representation or going back and filling the
      length in later.

   END          dict   END   --
      Pop the dictionary off of the stack and terminate the currently
      open ASN.1 object. Must be paired with a BEGIN.

Getting Items Based on Their Values

   One problem that has not been dealt with was alluded to earlier:
   When dealing with array data, how do you specify one or more entries
   based upon some value in the array entries? Consider the situation
   where there are several interfaces.
   The data might be organized as:

      interfaces {
              interface { type, ip-addr, in-pkts, out-pkts, ...}
              interface { type, ip-addr, in-pkts, out-pkts, ...}
                      :
                      :
              }

   If you only want information about one interface (perhaps because
   there is an enormous amount of data about each), then you have to
   have some way to name it. One possibility is to just number the
   interfaces and refer to the desired interface as:

      interfaces(3)

   for the third one.

   But this is probably not sufficient since interface numbers may
   change over time, perhaps from one reboot to the next. This method
   is not sufficient at all for arrays with many elements, such as
   processes, routing tables, etc. Large, changing arrays are probably
   the more common case, in fact.

   Because of the lack of utility of indexing in this context, there is
   no general mechanism in the language for indexing.

   A better scheme is to select objects based upon some value contained
   in them, such as the IP address or process name. The GET-MATCH
   operator provides this functionality in a fairly general way.

   GET-MATCH    array value template   GET-MATCH   array
      <array> should be an array (dictionary containing only one type
      of item). The first tag in <value> and