Strand: scalable trilateration with Node.js

Tserpes, Konstantinos; Pateraki, Maria; Varlamis, Iraklis

doi:10.1186/s13677-019-0142-y

Software
Open access
Published: 12 November 2019

Journal of Cloud Computing volume 8, Article number: 16 (2019)

Abstract

This work reports on the development details and results of an experimental setup for the localization of the attendants of a music festival. The application had to report in real time the asymmetric crowd density based on the Received Signal Strength Indicator (RSSI) between the attendants’ smartphones and an experimental installation of 24 WiFi access points. The impermanent nature of the application led to the implementation of a cloud-based solution, called “STRAND”. STRAND is based on Node.js components, which communicate through websockets, collect, process and exchange data, and continuously report the produced information to the end-user. To cope with the near real-time requirements and the volatility of the crowd concentration density, STRAND horizontally scales the trilateration component, i.e. the component that estimates the user location based on distance measurements. STRAND was tested during the festival days in July 2018 and the results show a system that copes with very high loads and achieves the temporal and accuracy requirements that were set.

Introduction

A music festival in Germany attracts tens of thousands of visitors within a mid-July weekend every year. At peak times it reaches almost 150,000 visitors. The organizers would like to have a bird’s eye view of the concentration of the visitors in the area of the festival, so as to identify emergency situations and to swiftly adapt their risk mitigation plans, such as the evacuation plans of the area. This resulted in the need for a heatmap visualization that depicts the crowd concentration density on the map in near-real time (i.e. at 1 min intervals). This implies that once the position information is collected by an installation of sensors, it has to be processed and reported to the organizers’ dashboard within 1 min.

During the festival days, the large crowd is concentrated in a relatively confined space. This concentration leads to a high contention ratio of the effective mobile telecommunication bandwidth, practically rendering it useless. The automatic detection of crowd concentration in near-real time is a critical yet challenging task for the organizers. Image analysis techniques for this task suffer from a lack of computational resources to handle the increased complexity and result in an increased installation cost. Also, such methods do not perform well in all conditions (e.g. in low light) and can hardly avoid counting the same person multiple times. Techniques that transmit the GPS signal of users’ devices via a smartphone app are energy and bandwidth consuming. As a result, the use of the official festival mobile app for collecting position data from the devices’ GPS receiver is not an option, since the app would have to compete for the limited bandwidth with popular mobile apps such as Instagram, Facebook, Twitter, etc.

The restrictions of visual techniques in such an environment, the high energy consumption of embedded GPS sensors in smartphones, and the frequent loss of or errors in the GPS signal due to unavoidable obstructions (e.g. trees, buildings) leave space for passive techniques, which take advantage of the pervasive availability of WiFi infrastructure and allow effective localisation and crowd concentration estimation in outdoor as well as indoor scenarios.

To apply WiFi-based outdoor localization we deployed a number of Raspberry Pi devices configured to operate as open WiFi access points (APs). The mobile devices of the visitors, with their WiFi transceiver activated, would poll continuously for the networks in their vicinity, exchanging some basic information with the open WiFi access points. The received signal strength indicator (RSSI) could then be used to estimate the distance of the users from multiple access points and then, by a process known as "trilateration", to estimate the position of those users. This is an approach commonly employed in the literature, mainly in indoor localization setups.

From a non-functional point of view, the application needed to be scalable, i.e. to be able to simultaneously locate a large number of users, maintaining the near real-time (1 min) requirement and ensuring that the cost in resources fluctuates proportionally to the volume of the asymmetrically distributed crowd at various times.

Also, the application should provide adequate guarantees regarding the users’ location privacy. To deal with the first issue it was necessary to employ cloud resources and the cloud providers’ API that would enable the automatic scaling. For the issue of privacy, as the user location can be correlated with far more personal information related to behavior, mood etc., appropriate encryption was adopted to preserve and guarantee the user location privacy and anonymity, namely the MAC address of the users’ mobile devices was obfuscated using pseudoanonymization.

A third requirement was related to the accuracy of the location estimation. Considering the need for minimum infrastructure cost and a low number of nodes for obtaining an estimate of the user location, so as to avoid delays and performance degradation due to packet collisions or wrong measurements [43], an error of 12 meters in terms of 2D accuracy was permitted by the festival organizers. This requirement is also tightly linked to the previous one, as this relaxed accuracy threshold also served as an artificial way to obfuscate the actual users’ positions. Finally, in parallel to the low cost aspect of the application, vendor lock-in should be avoided, as the festival is essentially free and relies on volunteers without much expertise in the field of application development.

The experimental setup comprised 24 operational Raspberry Pi devices covering critical areas of the festival grounds, including the main entrance/exit, the three main stages, the food and drink stand and the toilets. The devices were deployed in clusters of 3 or 4, with trees, infrastructure and festival facilities often obstructing their line of sight with the attendants. The measurements were collected in an on-site system and then distributed through a satellite link to the cloud-based application pipeline. The federation between the edge nodes, i.e. the Raspberry Pi devices, and the cloud was orchestrated by a cloud brokerage platform, called BASMATI ([2, 3]).

The research objectives were to implement the cloud-based application that would support the processing of the measurements and provide a near real-time (1 min intervals) heatmap of the crowd’s concentration density. The low-cost requirement for the application as well as its impermanent nature justified the use of cloud resources and the satellite link instead of an investment in permanent resources. The implementation was based on Micro-Service Architecture inspired by a pattern introduced in [36].

The main contributions of this work are:

an open source software reference implementation,
a cost-efficient, easy-to-deploy, large-scale localization system,
a micro-service architecture pattern that allows efficient load balancing on the cloud.

The implementation details of STRAND and the experience from its use during the festival days (July 19-22, 2018) are described in what follows: “Related work” section provides an analysis of the state of the art that justified the individual decisions in the application implementation; “Implementation” section gives the implementation details including the application architecture, the components’ functionality and interfaces, the implementation technology decisions and the key design characteristics. The evaluation results are presented in “Evaluation results” section and the major conclusions of this work are summarized in “Conclusions” section.

Related work

A thorough analysis of the current state-of-the-art in two main distinct domains was conducted prior to the implementation of the application: the localization systems and the cloud load balancing techniques. The selection of tools, technologies and approaches was based on the particular requirements of the application. This analysis is presented in what follows.

Passive sensing localization techniques

The use of image analysis techniques for the estimation of crowd density and person localisation has been proposed in the literature. For example, road traffic detection systems use image segmentation and analysis techniques to process the signal from road cameras [5] for the detection and counting of vehicles. In the case of crowd counting, Crowdnet [6] is a deep CNN trained to analyse video streams, whereas Convolutional LSTMs have been proposed in [41] for creating crowd density maps. An interesting survey on the use of CNNs for single-image crowd counting can be found in [32]. The advantage of image analysis methods is that they are passive and device-free, since they can track any person in sight, without requiring smartphones or any other device. Their main disadvantage is that they are computationally heavier, thus they require more processing power. In addition, they are not applicable in all lighting conditions, unless thermal cameras, infrared cameras or combinations of them are used, which also increases the cost of the installation. Finally, visual solutions may have to tackle the problem of blind spots. In the case of a festival, this requires multiple cameras to increase coverage and careful positioning in order to avoid double counting. The latter is avoided by device-based methods, which employ the device identity for disambiguation.

A large body of work deals with the problem of localization of "blind" nodes by passively sensing WiFi, Bluetooth or RFID signals. The methods are either device-free [10, 14] or device-dependent [40]. In the former case, a set of transmitters and receivers operates on a constant basis and human presence is detected based on changes in the strength of the received signals. In the latter case, users carry devices that transmit WiFi or similar signals, and a set of area sensors continuously collects the signal and analyses its strength. In such cases, the signal strength is used to estimate the distance from each fixed sensor node, and algorithmic approaches, such as trilateration or N-point lateration, are employed for the estimation of the position of the moving signal source.

In fact, the concept appears in multiple works, both in indoor and outdoor localization, with the former comprising the bulk of the works in the literature (e.g. [7, 18, 29, 31, 43]). When it comes to outdoor localization, a range of approaches is used to determine the location of a node in question. The majority of them rely on measuring the distance of the blind node from a number of fixed-point (anchor) nodes that are part of the same Wireless Sensor Network (WSN) and then employing algorithms to estimate the node’s location. Some common distance measurement methods are angle of arrival (AoA) (e.g. [39]), time of arrival (ToA) (e.g. [30]), time difference of arrival (TDoA) (e.g. [42]), acoustic energy (e.g. [38]), time of flight (TOF) (e.g. [16]) and received signal strength indicator (RSSI) (e.g. [13, 34, 35]). The first three methods require a complex hardware setup, while TOF needs line of sight to effectively locate nodes. On the other hand, RSSI-based solutions are easy to implement and cost efficient but less accurate, due to additional signal attenuation resulting from transmission through big obstacles and severe RSS fluctuation due to multipath fading ([9, 24]).

Beyond this range-based technique, other solutions have been presented for outdoor localization such as topographical maps and propagation-prediction tools, as well as statistical modeling, neural networks and particle filters ([23]).

Once the distance of the node in question is known from at least 3 known anchor nodes, the problem is reduced to an overdetermined system. Assuming a linear state space, the system comprises at least 3 second-degree polynomials expressing the Euclidean distances of the node in question from the anchor nodes. By subtracting them from one another, the polynomial degree is reduced and thus the system is solvable with standard algebraic solutions such as a least squares approximation. This approach was followed by [25] and it seems to be appropriate for small distances and near perfect input. In the case of the festival, the distances are large and the natural environment affects the RSSI. Furthermore, if the number of equations increases, i.e. more than 3 anchor points report their inaccurate RSSI with the node in question, the complexity of the linear least squares solution increases. The alternative is to use a nonlinear least squares fitting approach such as the Levenberg-Marquardt curve-fitting algorithm ([12]). The latter is appropriate for real-time operations due to its low complexity; however, its accuracy degrades when the measuring node’s speed increases ([26]).
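As an illustration of the linearization step described above, consider a local planar frame in which the unknown position is (x, y), anchor j lies at (x_j, y_j) and its estimated distance is d_j. These symbols are notation introduced here for the sketch and are not taken from [25]. The n distance equations are

$$(x-x_{j})^{2}+(y-y_{j})^{2}=d_{j}^{2},\qquad j=1,\dots,n$$

Subtracting the n-th equation from each of the others cancels the quadratic terms in x and y and leaves n-1 linear equations:

$$2(x_{n}-x_{j})\,x+2(y_{n}-y_{j})\,y=d_{j}^{2}-d_{n}^{2}-x_{j}^{2}-y_{j}^{2}+x_{n}^{2}+y_{n}^{2},\qquad j=1,\dots,n-1$$

Writing these as $A\,p=b$ with $p=(x,y)^{T}$, the linear least squares estimate is

$$\hat{p}=(A^{T}A)^{-1}A^{T}b$$

With noisy distances the n-1 equations are generally inconsistent, which is why a least squares solution, rather than an exact one, is sought.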

This work employs an RSSI-based trilateration localization algorithm to accurately localize the festival attendants’ smartphones. To deal with the near real-time requirements, the Levenberg-Marquardt curve-fitting algorithm was used for the trilateration part, considering the fact that the monitored crowd moves typically in very low speeds.

Cloud computing load balancing

Load balancing is a critical component provided by every public cloud service provider, as it allows the application to adapt to load demands dynamically. Scale-out and scale-in operations are typically employed, allowing the horizontal scaling of the application, i.e. adding new cloud resources (scale-out) and removing them (scale-in) at runtime so as to cope with the load. On the other hand, vertical scaling, i.e. adding more "power" to the existing cloud resources for load balancing purposes, is generally rarer (e.g. [33]) due to the high overhead and hard constraints involved (usually the machines have to be reset). Load balancing in IaaS environments implies that an application can scale through the addition or removal of Virtual Machines (VM). This is a standard practice in many applications, including distributed databases [20].

There are various taxonomies for load balancing strategies in the literature (e.g. [1, 22]). Perhaps the most relevant classifications for STRAND relate to the distinction of application-agnostic vs. application-oriented and dynamic vs. static load balancing strategies.

Practically all major cloud providers are offering off-the-shelf IaaS solutions for load balancing ^{Footnote 1}^{Footnote 2}. In order for them to maintain a general purpose nature, they are made in a stateful way, i.e. they operate independently of the application characteristics. This is commonly referred to as "load balancing as a service" ([8, 27]). As such, the load balancing strategy is built on the basis of an infrastructure-related metric, such as the resource utilization of the processors, or the incoming traffic (requests per second-RPS). These strategies are part of what is referred to as internal and/or HTTP load balancing. There is the option to distribute instances from a regional managed instance group, based on custom-made metrics using external tools such as Google’s Stackdriver^{Footnote 3} but this usually comes at an extra cost.

However, in other cases, the load is defined in terms of application-related metrics, as in distributed data management systems, where the load is dependent on the amount of load/store operations ([21]). In these cases, the Load Balancer must be integrated in the application and be able to invoke the public cloud providers’ API to manage VMs. This approach also comes at the cost that apart from the deployment of the VM, the Load Balancer needs to instruct the public cloud’s API to install and run the application. This can be done through a startup script that installs and resolves all necessary dependencies and runs the processor’s code or deploys docker files.

This latter approach was selected in the case of this application in order to remove the burden of knowing the public clouds’ concepts from the future application developer. When someone will have to re-run the application, perhaps in a different cloud provider, the idea is to stick to basic devops (expressed in the startup script), which most likely won’t change in the years to come.

In terms of the load balancing strategy, a dynamic approach was followed as opposed to a static one. In static load balancing, a fixed number of operational processors (or remote nodes) is reserved and the system uses them accordingly. In the case of STRAND, the application scaled in and out based on some predetermined rules adapting to runtime conditions, and in particular the load itself. In the literature, there is a great number of dynamic load balancing approaches in cloud computing (e.g. [11, 15, 28]) that apply sophisticated mechanisms to historical data so as to predict and cope with future loads. In the investigated case, these solutions were not applicable due to the lack of previous knowledge and due to the fact that the systems in the literature do not investigate mobility patterns, which were relevant in this case, but different parameters such as e-commerce consumer habits.

Among the two controlling forms in dynamic load balancing algorithms, namely centralized and distributed, STRAND opted for the centralized. In centralized load distribution, a single node in the network is nominated to be responsible for all load distribution in the network. In the distributed approach, many nodes are undertaking the responsibility of sharing the load ([17]). A distributed solution like [19] would infuse unnecessary complexity with multiple cloud components needing to be configured to talk to one another.

Implementation

The processing pipeline

The key concept in this work was to implement a workflow of standalone components that will process the raw measurements as data streams and will deliver the data for the heatmap visualization. As such, the architecture of the application involves a logical pipeline of a number of components, namely, the data Collectors, the data Aggregator, the Clusterer, the Load Balancer, the Trilaterator (Processor), the Storage and the Frontend (Fig. 1). These are described in detail in what follows.

The first two components in the pipeline are deployed locally on site, as part of a local network, and communicate through standard TCP/IP protocols. The rest of the components are deployed on virtual machines (VM) on cloud resources and communicate through websockets. The focus of this paper is to report on the details of the cloud-based part of the application; however, for completeness, the on-site part is also presented.

According to this architecture, the raw WiFi signal measurements transmitted by every smartphone in proximity are collected by the Raspberry Pi, using the nexmon driver (https://github.com/seemoo-lab/nexmon) and a shell script, which was written for this purpose. Similar scripts are widely available online ^{Footnote 4}. Each measurement contains a timestamp and the id of the device that transmitted the signal. Data from the Raspberry Pis, which are the local data collectors, are aggregated and anonymized by a physical machine onsite, which stores the data locally and provides an API for accessing them. A cloud-based filtering module is responsible for retrieving data at regular intervals from the onsite machine, aggregating them per uid and sending them to the cloud-based load balancer. The latter is responsible for retrieving data and re-distributing them to micro-services hosted in cloud-based virtual machines for further processing, storage and display. The details for these operations are provided in what follows.

Component functionality

For a better understanding of the role of every component involved in the pipeline, this Section provides a description of their functionality starting from the first item in the pipeline, i.e. the Collectors.

Collectors

The data Collectors execute the task of sensing for WiFi adapters in their vicinity at fixed time intervals and maintain a record of their findings locally, in files. They do not perform any other task given that a) their computational capacity is limited, b) their main task of sensing is rather frequent (once every 15 s) and c) they are exposed to adverse weather conditions (outdoors under direct sunlight or rain). The Collectors are deployed in strategically selected positions so as to cover areas of interest and to ensure that some level of overlap exists between their areas of coverage (see Fig. 2).

Sensing: Operating as WiFi access points with the pretense that they are offering an open WiFi access, the Collectors implement a standard communication protocol with devices (mainly mobile) that have their WiFi transceivers on. Through this process, they are registering 4 main data items for each connected device:

uid: a unique user id that is in the form of a MAC address, assuming that the connecting, mobile devices are always transmitting the same MAC address.
did: the device id of the Collector device that collected the data. This field is needed to map the device that collected the signal to a specific location (Lat, Lon pair).
rssi: the received signal strength indicator in dB. This is needed to estimate the radius upon which the user was detected with the centre being the collector’s device location.
timestamp: unix time during which the data were generated

A unique uid is needed so as to avoid depicting the same user many times in the heatmap, which would affect its accuracy. However, in some cases, namely in some iOS versions and a few Android devices, the transmitted MAC address is randomly generated by the device itself. This cannot be tackled in a systematic way and it is considered a system error. Nevertheless, the statistical impact of this error is negligible, as a) it has been observed that in the vast majority of the cases during the festival, the mobile devices transmit their actual MAC address and b) we are interested in the number of unique visitors during a limited timespan, a fact which significantly reduces the possibility that a user whose device “fakes” its MAC address will be counted more than once within this timespan.

Storing: Each Collector is polling its surroundings for mobile devices in a fixed time interval and writes out the collected information to a locally stored file. One timestamped file is created for each round of measurements in order to mitigate potential synchronization issues.

Aggregator

The data Aggregator is providing four basic functionalities: data aggregation, initial data filtering, pseudoanonymization and data access provision to other components. It is an onsite component running on a physical machine which is part of the Collectors’ network. This machine has access to the internet, thus it bridges the local Collectors’ network to the application cloud resources. It persists the data from the files it collects, organizes them in a relational database and emits them to interested parties.

Data aggregation: The Aggregator, being part of the Collectors’ network, can access their storage and pull the measurement files at fixed intervals. It does so by connecting to each one of them through SSH and executing a shell script to retrieve the latest data files that it hasn’t retrieved yet. In this way the need for hard synchronization constraints is lifted with the Aggregator and each of the Collectors executing their operations at independent times and possibly at a different frequency.

Filtering: Occasionally there are reasons to discard certain measurements, e.g. the same data are received twice, or the signals are too weak to be meaningful. This component filters out such problematic measurements. Also, there are many cases in which signals are received from fixed WiFi access points external to the application, which consistently try to register to the open WiFi network. A lookup table can be used here to allow the Aggregator to filter out MAC addresses that belong to manufacturers of such network devices, or even MAC addresses whose measurement values persist over time.
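The filtering rules above could be sketched as follows. This is a minimal illustration and not the Aggregator's actual code: the function name, the `minRssi` cut-off of -90 and the MAC prefix list are hypothetical, and only the record fields (uid, did, rssi, timestamp) come from the text.

```javascript
// Sketch of the Aggregator's initial filtering, under assumed thresholds.
// Drops weak signals, MAC prefixes of known fixed network devices and
// exact duplicates (same device, Collector and timestamp).
function filterMeasurements(measurements, { minRssi = -90, blockedPrefixes = [] } = {}) {
  const seen = new Set();
  return measurements.filter(({ uid, did, rssi, timestamp }) => {
    if (rssi < minRssi) return false; // too weak to be meaningful
    const mac = uid.toLowerCase();
    // Lookup table of manufacturer prefixes (hypothetical values).
    if (blockedPrefixes.some((p) => mac.startsWith(p.toLowerCase()))) return false;
    const key = `${mac}|${did}|${timestamp}`;
    if (seen.has(key)) return false; // same data received twice
    seen.add(key);
    return true;
  });
}
```

Filtering is applied here on the raw MAC address, i.e. before the pseudoanonymization step, since a manufacturer prefix can no longer be recognized after hashing.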

Pseudoanonymization: To protect user privacy, the Aggregator uses a fixed hash function to transform the uid value from a MAC address to a new, seemingly random, device id. The hash function is deterministic which means that the same input will always result in the same output because it is important to distinguish unique users.
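A deterministic mapping of this kind could look like the sketch below, which uses Node's built-in `crypto` module. The choice of SHA-256, the salt value and the 16-character truncation are assumptions made for illustration; the text only states that a fixed, deterministic hash function is used.

```javascript
const crypto = require('crypto');

// Hypothetical secret salt, fixed for the duration of the event.
const SALT = 'example-festival-salt';

// Maps a MAC address to a stable pseudonymous id: the same MAC always
// yields the same id, so unique users can still be told apart.
function pseudonymize(mac, salt = SALT) {
  // Normalize so that "AA-BB-..." and "aa:bb:..." give the same id.
  const normalized = mac.toLowerCase().replace(/[^0-9a-f]/g, '');
  return crypto
    .createHash('sha256')
    .update(salt + normalized)
    .digest('hex')
    .slice(0, 16);
}
```

Keeping the salt secret matters in such a scheme, because the space of MAC addresses is small enough to be enumerated by anyone who knows the exact hash construction.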

Data access: The Aggregator stores the preprocessed measurements in a db and provides a RESTful API for other components to access it.

Filtering

The first cloud component provides a higher-order function that processes the input data structure so as to bring it into a form in which all the data necessary for trilateration can be found within a single object. Its main functionality is to filter the input data structure that it receives from the Aggregator.

Filtering: This functionality clusters messages based on their uid and timestamp, i.e. groups measurements with the same MAC and a timestamp indicating that they were generated within a 1 min time window. The clustering is essentially a two-step process: a) group by uid and b) merge measurements under a single uid and a single timestamp, in particular the last received timestamp in the time window. The resulting data structure comprises records containing the uid, the timestamp and an array of RSSI values, each coupled with the device id (did) of the Collector that collected it, i.e. an array of did, rssi pairs.
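The two-step grouping can be sketched as below. The function name, the `readings` field name and the alignment of windows to multiples of 60 s of unix time are assumptions for illustration; the text specifies only the grouping keys and the content of the resulting record.

```javascript
// Sketch: cluster measurements per uid within fixed time windows.
// Each output record holds the uid, the last timestamp seen in the window
// and the array of { did, rssi } pairs needed for trilateration.
function clusterMeasurements(measurements, windowSeconds = 60) {
  const groups = new Map();
  for (const { uid, did, rssi, timestamp } of measurements) {
    // Step a) group by uid (and by the window the timestamp falls into).
    const windowIndex = Math.floor(timestamp / windowSeconds);
    const key = `${uid}|${windowIndex}`;
    let group = groups.get(key);
    if (!group) {
      group = { uid, timestamp, readings: [] };
      groups.set(key, group);
    }
    // Step b) merge under a single timestamp: the latest one in the window.
    group.timestamp = Math.max(group.timestamp, timestamp);
    group.readings.push({ did, rssi });
  }
  return [...groups.values()];
}
```

A record with fewer than three readings cannot be trilaterated, so a downstream component would presumably discard it or wait for more data.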

Load balancer

The Load Balancer buffers the measurements received by the Aggregator in an internal queue and distributes the load to all connected processor components (Trilaterators). Judging from the queue length and the rate that it is evolving, the Load Balancer can request the deployment of more processor components or decommissioning of the excess ones. Assuming the existence of the appropriate resources, the Load Balancer provides temporal constraint guarantees and it is a central component in the implementation of the scalable application.

Load distribution: This functionality allows processor components (Trilaterators) to request and receive measurements to process. The Load Balancer removes the oldest pending object of measurements for a uid from the queue and pushes it to the requesting processor.

Scaling: Simple, demand-based autoscaling rules are adopted. The rationale is that the Load Balancer can identify the need for a new processor component by periodically monitoring the length of its queue. If the rate at which the queue size is increasing exceeds a certain threshold, then the Load Balancer proactively requests the deployment of new processor components. Conversely, if the rate is decreasing, the Load Balancer requests the decommissioning of the excess processor components. These requests are implemented through the underlying cloud provider’s API (the one to which the processor is deployed). The time that it takes for a new processor to be fully operational and the fluctuating number of measurements entering the application both play a decisive role in the scaling operations, imposing a large number of constraints and exceptions. As such, upper and lower thresholds are provided for the queue length. Exceeding those thresholds results in scaling commands being issued. To avoid new scaling commands being issued before the previous ones are completed, a cooldown period is provided. Furthermore, the number of VMs that can be spawned at any time is limited by a value set manually. Similarly, a warm-up period is considered, so that the count of running instances is increased by one only when a new instance actually starts operating. The selection of the various thresholds is discussed in detail in “Evaluation results” section.
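The threshold, cooldown and instance-cap rules can be sketched as a small decision function. This is a simplified illustration: all names and parameter values are hypothetical, the rate-of-change rule and the warm-up accounting are left out, and the actual call to the cloud provider's API is represented only by the returned command.

```javascript
// Sketch of a threshold-based scaling decision with a cooldown period.
// Returns 'scale-out', 'scale-in' or 'hold'; the caller would translate
// the command into a request to the cloud provider's API.
function createScaler({ upper, lower, cooldownMs, maxInstances, minInstances = 1 }) {
  let lastActionAt = -Infinity;
  return function decide(queueLength, runningInstances, now) {
    // No new command while a previous one may still be in progress.
    if (now - lastActionAt < cooldownMs) return 'hold';
    if (queueLength > upper && runningInstances < maxInstances) {
      lastActionAt = now;
      return 'scale-out';
    }
    if (queueLength < lower && runningInstances > minInstances) {
      lastActionAt = now;
      return 'scale-in';
    }
    return 'hold';
  };
}
```

A caller would invoke `decide` on a timer, passing the current queue length, the number of instances that have completed their warm-up, and the current time in milliseconds.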

Trilaterator

The Trilaterator undertakes the task of processing the measurements that are assigned to it by the Load Balancer and finding the approximate location of the respective user at the given timespan. It is deployed on Virtual Machines, and cold or hot deployment is employed based on the end user’s assigned budget. The particular processing that is done is called trilateration and it requires at least three distances of the object in question from an equivalent number of known points, in order to estimate its position. Thus, before the trilateration task, the Trilaterator calculates the distance of the user in question from each Collector device by means of the RSSI.

Distance calculation: The distance (dis) is calculated based on the RSSI and the location of the Collector that measured this signal strength. The formula is:

$$dis=10^{\frac{P_{tx}-rssi}{10\cdot plex}}$$

where P_tx is the transmit power of the Collector device and plex is the path-loss exponent. The context is that the transmitter and the receiver are rarely within line of sight of each other and that the signal propagation model needs to consider the environment within which the transmission takes place as well as the power at which the signal is transmitted [25]. This general purpose formula considers these two parameters. The default transmit power for DD-WRT based routers is 70 mW, i.e. approximately 18.5 dBm. Also, as a rule of thumb the path-loss exponent receives the following values: 2 for free space, 2.7 to 3.5 for urban areas, 3.0 to 5.0 in suburban areas and 1.6 to 1.8 for indoors when there is line of sight to the router. For the particular experiment, it was set to 2.5.
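As a worked sketch of the distance calculation, the function below reads the formula as the usual log-distance model, in which the difference between transmit power and RSSI is divided by ten times the path-loss exponent. The function and parameter names are hypothetical; the defaults are the 18.5 dBm transmit power and the exponent of 2.5 mentioned above, and the RSSI is assumed to be expressed on the same dBm scale as the transmit power.

```javascript
// Sketch: estimate the distance from an RSSI reading.
// dis = 10 ^ ((P_tx - rssi) / (10 * plex))
function rssiToDistance(rssi, txPower = 18.5, pathLossExponent = 2.5) {
  return Math.pow(10, (txPower - rssi) / (10 * pathLossExponent));
}

// With the defaults, every additional 25 dB of attenuation multiplies
// the estimated distance by ten.
console.log(rssiToDistance(-6.5)); // 25 dB below P_tx
console.log(rssiToDistance(-31.5)); // 50 dB below P_tx
```

Because the distance grows exponentially with attenuation, a few dB of RSSI noise translate into a much larger absolute error at long range than at short range, which is consistent with the relaxed accuracy requirement of the application.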

Trilateration: Since the Collector devices are at fixed points, we can employ a simple lookup table and the did of each Collector device to get the approximate location of the Collectors in GPS coordinates. Knowing the distances from n (n > 2) fixed points allows us to estimate the user location in the form of lat, lon coordinates.

In a perfect situation the rationale behind trilateration would be the following: For each user i and Collector device j there is a pair loc_j,dis_ij that defines a sphere of radius dis_ij around location loc_j and the user may lie in any point of this sphere. Two such pairs may limit the user location in any point of the circle that is created by the intersecting spheres. Three pairs further limit the possible user location to the two points defined by the intersection of the three spheres. Since we only need to identify the lat, lon pair of the user, the altitude can be disregarded. Therefore, the earth operates as a fourth sphere further limiting the points to a single one.

In reality, there are multiple errors that are infused by various systems along the way, including errors in measuring the RSSI, inaccurate GPS coordinates of the Collectors, erroneous calculation of the distance, etc. Therefore, the spheres may not always intersect, or the intersection may commonly lead to an area rather than a point (Fig. 3).

Selecting any point from the intersection area will result in a deviation from the expected values for at least two of the measurements, i.e. the selected point will not be on at least two of the circles. The key point is to identify the lat, lon pair for the user within the intersection area that minimizes this deviation. A solution for this problem is to find the user position that minimizes the sum of squares of this deviation. This method is called least squares and it is commonly used in data fitting problems like the one at hand. In fact, we want to identify the function f which maps the user location (Y_i) to the {loc, dis} pair (X_i) of the i-th Collector from a set of m, or in other words Y_i = f(X_i), i = 1…m. The evaluation of this function for every Collector will give a residual r_i, therefore the point is to minimize $\sum_{i=1}^{m} r_{i}^{2}$.

This problem is solved using the Levenberg-Marquardt algorithm as it is implemented in the trilat package (https://www.npmjs.com/package/trilat).
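To make the nonlinear least squares step concrete, the following self-contained sketch applies a basic Levenberg-Marquardt iteration to the 2D case. It is not the code of the trilat package and makes several assumptions: anchors are given in a local planar frame in meters (objects with hypothetical fields x, y and d for the estimated distance), the residual for anchor i is taken as the distance from the candidate point to the anchor minus d, and the search starts from the centroid of the anchors.

```javascript
// Sketch: 2D trilateration by Levenberg-Marquardt on the residuals
// r_i = sqrt((x - x_i)^2 + (y - y_i)^2) - d_i.
function trilaterate(anchors, { iterations = 50, lambda = 1e-3 } = {}) {
  // Initial guess: centroid of the anchors.
  let x = anchors.reduce((s, a) => s + a.x, 0) / anchors.length;
  let y = anchors.reduce((s, a) => s + a.y, 0) / anchors.length;

  // Sum of squared residuals at a candidate point.
  const cost = (px, py) =>
    anchors.reduce((s, a) => s + (Math.hypot(px - a.x, py - a.y) - a.d) ** 2, 0);

  let current = cost(x, y);
  for (let k = 0; k < iterations; k++) {
    // Accumulate J^T J (2x2, symmetric) and J^T r (2x1).
    let a11 = 0, a12 = 0, a22 = 0, g1 = 0, g2 = 0;
    for (const a of anchors) {
      const dist = Math.hypot(x - a.x, y - a.y) || 1e-9; // avoid division by zero
      const r = dist - a.d;
      const jx = (x - a.x) / dist;
      const jy = (y - a.y) / dist;
      a11 += jx * jx; a12 += jx * jy; a22 += jy * jy;
      g1 += jx * r; g2 += jy * r;
    }
    // Solve (J^T J + lambda * diag(J^T J)) * step = -J^T r.
    const m11 = a11 * (1 + lambda);
    const m22 = a22 * (1 + lambda);
    const det = m11 * m22 - a12 * a12;
    if (Math.abs(det) < 1e-12) break;
    const dx = (-g1 * m22 + g2 * a12) / det;
    const dy = (-g2 * m11 + g1 * a12) / det;
    const next = cost(x + dx, y + dy);
    if (next < current) {
      // Accept the step and trust the Gauss-Newton direction more.
      x += dx; y += dy; current = next;
      lambda /= 10;
      if (Math.hypot(dx, dy) < 1e-9) break;
    } else {
      // Reject the step and increase the damping.
      lambda *= 10;
    }
  }
  return { x, y };
}
```

Converting between lat, lon coordinates and the local planar frame is outside the scope of this sketch; over the extent of a festival site a simple local projection would likely be sufficient.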

Storage

The Storage component provides a RESTful API enabling other components to persist user locations to its database and at the same time to access the data.

Persisting: An API allows a client to "push" tuples of the form of uid, lat, lon, timestamp to the database.

Accessing: An API allows a client to "pull" tuples of the abovementioned format in a stateful, synchronized way, allowing the querying of the database using multiple criteria.

Frontend

The Frontend is an HTML-based web client that retrieves data from the Storage and depicts it as a heatmap visualization in real time.

Data retrieval: A timed AJAX request is using the Storage API to retrieve data.

Visualization: Using the Leaflet Javascript library (https://leafletjs.com), the Leaflet.heat extension (https://github.com/Leaflet/Leaflet.heat) and OpenStreetMap (https://www.openstreetmap.org/), the Frontend chromatically depicts the concentration density of users in various locations of the festival area within consecutive timespans.
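As a sketch of how the tuples retrieved from the Storage might be prepared for the heatmap, the helper below keeps the most recent position per uid and emits the [lat, lng, intensity] arrays that Leaflet.heat accepts as input points. The function name, the fixed intensity of 1 and the `radius` option value in the usage comment are assumptions; only the tuple fields (uid, lat, lon, timestamp) come from the text.

```javascript
// Sketch: turn Storage tuples into Leaflet.heat input points.
// Only the latest position of each uid is kept, so that a user is not
// depicted more than once in the same heatmap frame.
function toHeatPoints(tuples, intensity = 1) {
  const latest = new Map();
  for (const t of tuples) {
    const prev = latest.get(t.uid);
    if (!prev || t.timestamp > prev.timestamp) latest.set(t.uid, t);
  }
  return [...latest.values()].map((t) => [t.lat, t.lon, intensity]);
}

// In the browser, with Leaflet and Leaflet.heat loaded and `map` created:
//   L.heatLayer(toHeatPoints(tuples), { radius: 25 }).addTo(map);
```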

Interfaces and data formats

The basic mode of communication between the cloud-based components is implemented through websockets. Each component maintains a websocket interface for its counterpart for the purpose of data exchange. Depending on the situation the data may be “pushed” or “pulled” (sent after request) from one component to another.

The following subsections provide an overview of the inputs/outputs of the components in terms of data exchange.

Collector

As already stated, the Collectors do not expose any interface, rather they allow predefined components to pull their data through SSH.

Input: WiFi signals

Output: raw data files, timestamped. An example measurement from one Collector looks like this: