Data flow | The Requirements Engineer

I define data flow as a sequence of data items that move from a producer to a consumer: data in motion. For example, a GPS signal moves from a satellite to a navigation function, which turns it into a suggested route. That route in turn becomes input to another function. In short, data flow connects functions, systems, and actors.

I use data flow diagrams (DFDs) to model data flow. A DFD has four kinds of elements: processes, data stores, data flows, and sources or sinks. Processes represent functions of the system. Sources and sinks represent neighboring systems or actors. Data stores represent data at rest, such as route parameters or traffic message logs. Each data flow carries a clear label, for example “GPS signal” or “desired destination.”
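As a minimal sketch, the four kinds of DFD elements could be captured in a small Python model. The class names, and the process names "Determine position" and "Calculate route", are illustrative assumptions and not part of any standard DFD notation. Only the labels "GPS signal", "desired destination", and "route parameters" come from the examples above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    PROCESS = "process"        # a function of the system
    DATA_STORE = "data store"  # data at rest
    TERMINATOR = "terminator"  # source or sink: neighboring system or actor


@dataclass(frozen=True)
class Node:
    name: str
    kind: NodeKind


@dataclass(frozen=True)
class DataFlow:
    label: str       # e.g. "GPS signal"
    producer: Node
    consumer: Node


@dataclass
class Diagram:
    flows: list[DataFlow] = field(default_factory=list)

    def connect(self, label: str, producer: Node, consumer: Node) -> DataFlow:
        """Draw a labeled arrow from producer to consumer."""
        flow = DataFlow(label, producer, consumer)
        self.flows.append(flow)
        return flow

    def inputs_of(self, node: Node) -> list[str]:
        return [f.label for f in self.flows if f.consumer == node]

    def outputs_of(self, node: Node) -> list[str]:
        return [f.label for f in self.flows if f.producer == node]


# Hypothetical navigation example
satellite = Node("GPS satellite", NodeKind.TERMINATOR)
driver = Node("Driver", NodeKind.TERMINATOR)
locate = Node("Determine position", NodeKind.PROCESS)
plan = Node("Calculate route", NodeKind.PROCESS)
params = Node("Route parameters", NodeKind.DATA_STORE)

dfd = Diagram()
dfd.connect("GPS signal", satellite, locate)
dfd.connect("current position", locate, plan)
dfd.connect("desired destination", driver, plan)
dfd.connect("route parameters", params, plan)
dfd.connect("suggested route", plan, driver)

print(dfd.inputs_of(plan))
# ['current position', 'desired destination', 'route parameters']
```

Keeping the flow as a separate object with its own label reflects that a data flow is a named element of the diagram and not just a link between two nodes.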

I follow one fundamental rule: the diagram shows all input and all output data. Because data flow specifies causal dependencies, this rule ensures that no dependency is missed and that each function has its required inputs. However, a DFD does not express explicit sequence. It only says which inputs must be available before a function can do its work. If I need to express order or timing, I supplement DFDs with state transition diagrams, which model states and events. Combining DFDs with state machines gives me both data motion and control sequencing.
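The rule can be checked mechanically. The following is a sketch under stated assumptions: the table of required inputs and the process names are hypothetical, and the drawn flows deliberately omit "desired destination" to show what the check reports.

```python
# Hypothetical declaration of what each function needs before it can work.
REQUIRED_INPUTS = {
    "Determine position": {"GPS signal"},
    "Calculate route": {"current position", "desired destination"},
}

# Flows drawn in the diagram so far: (label, producer, consumer).
DRAWN_FLOWS = [
    ("GPS signal", "GPS satellite", "Determine position"),
    ("current position", "Determine position", "Calculate route"),
    ("suggested route", "Calculate route", "Driver"),
]


def missing_inputs(required, flows):
    """Return, per process, the required inputs that no drawn flow delivers."""
    drawn = {}
    for label, _producer, consumer in flows:
        drawn.setdefault(consumer, set()).add(label)
    return {
        process: sorted(needed - drawn.get(process, set()))
        for process, needed in required.items()
        if needed - drawn.get(process, set())
    }


def ready(process, available, required=REQUIRED_INPUTS):
    """A function can work once all its inputs are available, in any order."""
    return required[process] <= set(available)


print(missing_inputs(REQUIRED_INPUTS, DRAWN_FLOWS))
# {'Calculate route': ['desired destination']}

# Arrival order does not matter; only availability does.
print(ready("Calculate route", ["desired destination", "current position"]))
# True
```

Note that `ready` takes no position on which input arrived first. That is the sense in which a DFD expresses availability and leaves sequence to another model.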

I adopt simple notation. I draw a circle or box for the system under development to show the system boundary, and arrows for the data flows to and from neighboring systems. Inside the system, I add processes to show functionality and data stores to show information that is kept for a period. Readers can then trace how data moves and where it rests.
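One way to see what the system boundary does is to separate the flows that cross it from the flows that stay inside. The sketch below assumes a hypothetical navigation system. The element names and the `boundary_flows` helper are illustrative only.

```python
# Elements drawn inside the system boundary (hypothetical example).
INSIDE_SYSTEM = {"Determine position", "Calculate route", "Route parameters"}

# All flows in the diagram: (label, producer, consumer).
ALL_FLOWS = [
    ("GPS signal", "GPS satellite", "Determine position"),
    ("current position", "Determine position", "Calculate route"),
    ("desired destination", "Driver", "Calculate route"),
    ("route parameters", "Route parameters", "Calculate route"),
    ("suggested route", "Calculate route", "Driver"),
]


def boundary_flows(flows, inside):
    """Keep only flows with exactly one end inside the system boundary."""
    return [
        (label, producer, consumer)
        for label, producer, consumer in flows
        if (producer in inside) != (consumer in inside)
    ]


for label, producer, consumer in boundary_flows(ALL_FLOWS, INSIDE_SYSTEM):
    print(f"{producer} --[{label}]--> {consumer}")
# GPS satellite --[GPS signal]--> Determine position
# Driver --[desired destination]--> Calculate route
# Calculate route --[suggested route]--> Driver
```

The flows that remain are the arrows to and from neighboring systems. The two filtered out ("current position" and "route parameters") are internal detail.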

I also explain trade-offs. Some dynamic modeling approaches, such as Petri nets, combine data flow and control flow in one diagram. I avoid that when clarity matters: when diagrams grow complex, I keep data flow and control flow separate to reduce cognitive load. I use the marionette metaphor to explain how the two collaborate. The DFD functions are the puppet's parts, which can move independently. The state machine is the cross that links the puppet's strings. The metaphor shows how control constrains data-driven functions.
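The marionette metaphor can be sketched in code: a small state machine (the cross) decides which data-driven functions (the puppet parts) are allowed to move in each state. The states, events, and function names below are hypothetical and chosen only to fit the navigation example.

```python
class NavigationControl:
    """Hypothetical state machine that enables or disables DFD functions."""

    # (current state, event) -> next state
    TRANSITIONS = {
        ("idle", "destination entered"): "planning",
        ("planning", "route accepted"): "guiding",
        ("guiding", "destination reached"): "idle",
    }

    # Functions that are allowed to run in each state.
    ENABLED = {
        "idle": {"Determine position"},
        "planning": {"Determine position", "Calculate route"},
        "guiding": {"Determine position", "Calculate route", "Give directions"},
    }

    def __init__(self):
        self.state = "idle"

    def handle(self, event):
        """Follow a transition if one is defined; otherwise stay in place."""
        self.state = self.TRANSITIONS.get((self.state, event), self.state)

    def may_run(self, function, inputs_available):
        """A function runs only if its data is there AND control allows it."""
        return bool(inputs_available) and function in self.ENABLED[self.state]


control = NavigationControl()
print(control.may_run("Calculate route", inputs_available=True))   # False
control.handle("destination entered")
print(control.may_run("Calculate route", inputs_available=True))   # True
print(control.may_run("Calculate route", inputs_available=False))  # False
```

The data condition and the control condition stay in separate places, which mirrors keeping the DFD and the state machine as separate diagrams.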

I connect data flow modeling to other notations. I use DFD-based context diagrams to show system scope and terminators, and I map those context diagrams to SysML block diagrams and to use-case views. I also show how DFDs interface with information-structure models.

Finally, I give practical advice. First, label data flows precisely. Second, include all inputs and outputs. Third, document assumptions about timing or ordering elsewhere, for example in a state transition diagram. Following these three points keeps data flow diagrams clear, useful, and easy to translate into other models.
