From a functional perspective, we divide I-DAG into three sub-components:
Label-based event streaming
Let IoT devices events be a sequence of error, backup and information messages with a representation as Ei, Bi and Ii, where each of the message belongs to sensory devices as Devicei in the distributed computing environment as shown in Fig. 3. At each time interval t, streams generated through a function fi holds an array of event messages G[1..(Ei,Bi,Ii)] with G[i]=fi. Therefore, when a new occurrence of event messages arrive, the function representation changes to G[i++] and the individual event message collection at each node could be represented as,
$$ G\left[i++\right]=G\left [\left(E_{i}, B_{i}, I_{i}\right) ++\right] $$
(1)
Where, G[i++] is a container managing multiple event messages arrival with x≥0.
In order to approximate the inner function elements of G[i++], implicit vectors such as x(E[1..n]), y(B[1..n]) and z(I[1..n]) are added into the stream instruction set with a proportion of (Ei,x)++, (Bi,y)++ and (Ii,z)++ and returns an output approximation as,
$$ Event_{m}=\sum_{i=1}^{n}E_{i}\ast B_{i} \ast I_{i} $$
(2)
Where, Eventm>0 and represents the container of processed heterogeneous event messages.

Lemma-1: \(SE_{o,p,q}=\sum _{n}^{i=1}\left \{ \left (e_{i}\times n_{o}\right), \left (b_{i}\times n_{p}\right), \left (i_{i}\times n_{q}\right)\right \}\)
The individual data segments of Ei, Bi and Ii arrives at nodes No, Np and Nq through an incremental function G[i++] that assembles segments in formation order. This order summarize stream segments in such a way that G[i++] stores SEo,p,q≤0.
Lemma-2: E[s]=PP(ei,bi,ii)
Since, \(SE_{o,p,q}= \sum _{n}^{i=1}\left \{ \left (E_{i}\times N_{o}\right), \left (B_{i}\times N_{p}\right), \left (I_{i}\times N_{q}\right)\right \}\), but, \(\sum _{n}^{i=1}\left \{ \left (E_{i}\times N_{o}\right)\times \left (B_{i}\times N_{p}\right)\times \left (I_{i}\times N_{q}\right)\right \}\neq N_{o,p,q}\times \left (\sum _{n}^{i=1}\left (E_{i},B_{i},I_{i}\right)\right)\). Therefore, the constraints are residing within the (Ei,Bi,Ii,). Moreover, if i=j=k then E[No,p,q]=E[1]=1 and if o≠p≠q, then E[No,p,q] are independent and could be retrieved as, \(E\left [N_{o,p,q}\right ]=\tfrac {1}{2}1+\tfrac {1}{2}\left (-1\right)\). After that, the linearity of expectation could be represented as,
$$ {\begin{aligned} E\left[SE_{o,p,q}\right]=E\left[\left(\sum_{i=1}^{n}\left(E_{i}, B_{i}, I_{i}\right)\right)\right]\left(\sum_{i=1}^{n}\left(N_{o}, N_{p}, N_{q}\right)\right) \end{aligned}} $$
(3)
$${\begin{aligned} =E\left[\sum_{o,p,q}^{n}\left(N_{o}, N_{p}, N_{q}\right)\left(E_{i},B_{i},I_{i}\right)\right] \end{aligned}} $$
$${\begin{aligned} &=\sum_{o}^{n}\left(N_{o}\right)E\left [E_{i},B_{i},I_{i}\right] +\sum_{o \neq p }^{n}\left(N_{p}\right)E\left [E_{i},B_{i},I_{i}\right]\\&\quad+\sum_{o \neq p \neq q}^{n}\left(N_{q}\right)E\left [E_{i},B_{i},I_{i}\right] \end{aligned}} $$
Where E[SEo,p,q] manages the heterogeneous events with independent expectation parameters.
Lemma-3: \(V\left [sE_{o,p,q}\right ]\leq 2E\left [sE_{o,p,q}^{}\right ]^{2}\)
Since,
$$V\left[SE_{o,p,q}\right]=E\left[\left(SE_{o,p,q}\right)\right]^{2}-E\left[SE_{o,p,q}\right]$$
$$=\left(\sum_{o,p}^{n}...N_{o}N_{p}\right)\times \left(\sum_{p,q}^{n}...N_{p}N_{q}\right) $$
$${\begin{aligned} &=\sum_{o,p,q}^{n}\left(...N_{o}N_{p}N_{q}\right)\leq 2\left(\sum_{o}^{n}E_{i},B_{i},I_{i}\right)\\&\quad\times \left(\sum_{p}^{n}E_{i},B_{i},I_{i}\right)\times \left(\sum_{q}^{n}E_{i},B_{i},I_{i}\right) \end{aligned}} $$
$$=2 E \left[SE_{o,p,q}\right]^{2} $$
Lemma-4: average T1 and T2 of SEo,p,q
Let A be the output of algorithm-1, so
$$E\left[S\right]=PP\left(E_{i}, B_{i}, I_{i}\right), V\left(A\right)\leq 2E\left [A\right]^{2} $$
and that equals to the,
$$\sigma \left(A\right)=\sqrt{V\left(A\right)}\leq \sqrt{2}E\left[A\right] $$
Therefore, the bound of stream segment could be obtained as,
$$PE\left[\left| A-E\left[A\right]\right|> \varepsilon E\left[A\right]\right] $$
Thus,
$${\begin{aligned} &PE\left[\left| A-E\left[A\right]\right|> \varepsilon E\left[A\right]\right]\\&\leq PE\left[\left| A-E\left[A\right]\right|> \sqrt{2} \varepsilon \sigma \left(A\right)\right] \end{aligned}} $$
In order to reduce the variance, we apply Chebyshev inequality [40] to \( \sqrt {2}\varepsilon > 1\), we get the output as,
$$E\left[A_{i}\right]=PP\left(E_{i}, B_{i}, I_{i}\right), V\left(A_{i}\right)\leq 2E\left [A_{i}\right]^{2} $$
So if B be the average of \(\phantom {\dot {i}\!}A_{i},...,A_{T_{1}T_{2}}\)
$$E\left[B\right]=PP\left(E_{i}, B_{i}, I_{i}\right), V\left(B\right)\leq \frac{2E\left [B\right]^{2}}{T_{1}T_{2}} $$
Now, by Chebyshev’s inequality, as \(T_{1}T_{2}\geq \frac {16}{\varepsilon ^{2} }\),
we get,
$$PE\left[\left| B-E\left[B\right]\right|> \varepsilon E\left[B\right]\right]\leq \frac{V\left(B\right)}{\left(\varepsilon H\left[B\right]\right)^{2}} $$
$$PE\left[\left| B-H\left[B\right]\right|> \varepsilon H\left[B\right]\right]\leq \frac{2H\left[B\right]^{2}}{\left(T_{1}T_{2}\varepsilon^{2}H\left[B\right]^{2}\right)}\leq \frac{1}{8} $$
At this point, streaming bound δ could be obtained but since a dependence of \(\frac {1}{\delta }\) is present, therefore, we apply lower bound inequality Hoeffding [41] on H[B]=PP(Ei,Bi,Ii) and get,
$$PE\left[\left(1-\varepsilon\right)H\left[B\right] \leq B\leq \left(1+\varepsilon\right)H\left[B\right]\right]\geq \frac{7}{8} $$
Now execute median function Z of T1T2 onto \(B,B_{1},...,B_{T_{1}T_{2}}\phantom {\dot {i}\!}\) and we get,
$$ PE\left[\left| Z-H\left[B\right]\right|\geq \varepsilon H\left[B\right]\right]\leq \delta $$
(4)
when,
$$T_{1}T_{2}\geq \frac{32}{9}ln\frac{2}{\delta } $$
The stream approximation could be obtained as,
$$ \left(E_{i}\right)_{N_{o}}=O\left(\frac{1}{\varepsilon^{2}}ln\frac{2}{\delta}\right)_{HN_{i,i}} $$
(5)
$$ \left(B_{i}\right)_{N_{p}}=O\left(\frac{1}{\varepsilon^{2}}ln\frac{2}{\delta}\right)_{BN_{o,p}} $$
(6)
$$ \left(I_{i}\right)_{N_{q}}=O\left(\frac{1}{\varepsilon^{2}}ln\frac{2}{\delta}\right)_{IN_{i,k}} $$
(7)
This stream approximation defines the existence of managing heterogeneous parameters in the I-DAG.
Heterogeneous stream transformation
The distributed stream elements with probability α(t) are sampled at time t with a computing average of,
$$\alpha \left(t\right)=\alpha,contant: error\simeq \frac{1}{\sqrt{\alpha \times t}}\rightarrow 0 $$
and,
$$\alpha \left(t\right)\simeq \frac{1}{\varepsilon^{2}\times t}:error\simeq \varepsilon, constant\ over\ time $$
In order to perform encapsulation, reservoir sampling is used because it allows adding first k stream elements to the sample having total items t−th with probability \(\frac {k}{t}\). Thus, for every t and i≤t, the sample probability is evaluated as,
$$ P_{i,t}=PE\left[s_{i}\ in\ sample\ at\ time\ t\right]=\frac{k}{t} $$
(8)
and for t+1, the sample probability becomes,
$$P_{t+1,t+1}=PE\left[s_{t+1}\ sampled\right]=\frac{k}{t+1} $$
This is mandatory because of the inter-connected heterogeneous IoT tuples that are to be incorporated with the internal of time. The processing of t+1 with i≤t eventually reduces the role of si and returns st+1 as,
$$ P_{i,t+1}=\frac{k}{t}\times \left(1-\frac{k}{t+1}\times \frac{1}{k}\right) $$
(9)
$$=\frac{k}{t}\times \left(1-\frac{1}{t+1}\right) $$
$$=\frac{k}{t}\times \frac{t}{t+1}=\frac{k}{t+1} $$
The frequency table of stream events uses the event arrival probability Pi,t+1 into Like space saving of count-min sketch to bring an order between transformed heterogeneous stream events as shown in Fig-3. This space saving function provides an approximation fx′ to fx for every x and consumes memory equals to \(O\left (\frac {1}{\Theta }\right)\). Therefore, when a stream vector G[n] is processed with G[i]≥0 for ∀i ε t, it estimates heterogeneous stream G′ of G as,
$$G\left[i\right]\leq G'\left[i\right]\ \ \ \forall i $$
and,
$$G'\left[i\right]\leq G\left[i\right]+\varepsilon \left| G\right|{~}_{1}\ \ \ \forall i, with\ probability\ \geq 1-\delta $$
Where, \(\left | G\right |{~}_{1} =\sum _{i} G\left [i\right ]\) and |G| 1≪streamlength having \(O\left (\frac {1}{\varepsilon ^{2}}ln\frac {2}{\delta }\right)_{HN_{i,i}}\), \(O\left (\frac {1}{\varepsilon ^{2}}ln\frac {2}{\delta }\right)_{BN_{o,p}}\) and \(O\left (\frac {1}{\varepsilon ^{2}}ln\frac {2}{\delta }\right)_{IN_{i,k}}\) memory with \(O\left (ln\frac {n}{\delta }\right)\) update time t.
The heterogeneous events stream \(\sum _{i} G\left [i\right ]\) consists of d independent hash functions h1...hd:[1..n]→[1..w] where, each of the stream element holds memory gp(i) that uses instruction set G[i]+=(Ei,Bi,Ii) having gp(i)+=(Ei,Bi,Ii) for ∀ jε 1..d and the frequency table of heterogeneous events stream could be retrieved as,
$$ G'\left[i\right]=min\left \{ g_{p}\left(i\right)|j=1..d \right \} $$
(10)
This declares that the accessibility of the heterogeneous events stream in enlisted in the I-DAG.
Lemma-5: G′[i]≥g[i]
The minimum count of heterogeneous events stream G[i] remains ≥ 0 for ∀ i with a frequency of update(gp(i)). The stream element having hash function Io,p,q=1 if gp(i)=gp(k)=0 could be retrieved as,
$$ H\left[I_{o,p,q}\right]\leq \frac{1}{range\left(g_{p}\right)}=\frac{1}{w} $$
(11)
By definition \(A_{o,p}=\sum _{k}H\left [I_{o,p,q}\right ]\times G\left [k\right ]\), the heterogeneous events stream can be represented as,
$$ A_{o,p}=\sum_{k}H\left[I_{o,p,q}\right]\times G\left[k\right]\leq \frac{\left| G\right|{~}_{1}}{w} $$
(12)
Now, this stream is well connected and could not be ready independently. Therefore, we apply Markov inequality and pairwise independence as,
$$ PE\left[A_{o,p} \geq \varepsilon \left| G\right|{~}_{1}\right]\leq \frac{H\left[A_{o,p}\right]}{\varepsilon \left| G\right|{~}_{1}}\leq \frac{\left(\frac{\left| G\right|{~}_{1}}{w}\right)}{\left(\varepsilon \left| G\right|{~}_{1}\right)}\leq \frac{1}{2} $$
(13)
if \(w=\frac {2}{\varepsilon }\) then,
$$PE\left[G'\left[i\right]\geq G\left[i\right]+\varepsilon \left| G\right|{~}_{1}\right] $$
$$=PE\left[\forall\ j \ :G\left[i\right] +A_{o,p}\geq G\left[i\right]+\varepsilon \left| G\right|{~}_{1}\right] $$
$$ =PE\left[\forall\ j \ : A_{o,p}\geq \varepsilon \left| G\right|{~}_{1}\right] \leq \left(\frac{1}{2}\right)^{d}=\delta $$
(14)
$$if \ d=log\left(\frac{1}{\delta}\right) $$
for fixed value of i as shown in Figs. 4 and 5. Thus, we observe that the events are synchronized to a central container with independence of accessibility.
I-DAG workflow
The events generated through IoT devices with a sequential order of PE[∀ j :Ao,p≥ε|G| 1]are scheduled onto the I-DAG that consists of an identifier LocatorI−DAG which reads events labels \(\left (E_{i}\right)_{N_{o}}=O\left (\frac {1}{\varepsilon ^{2}}ln\frac {2}{\delta }\right)_{HN_{i,i}}\), \(\left (B_{i}\right)_{N_{p}}=O\left (\frac {1}{\varepsilon ^{2}}ln\frac {2}{\delta }\right)_{BN_{o,p}}\) and \(\left (I_{i}\right)_{N_{q}}=O\left (\frac {1}{\varepsilon ^{2}}ln\frac {2}{\delta }\right)_{IN_{i,k}}\) in the source file and shuffle the pointer between n stages as shown in Fig. 6.
In order to perform stage predictor evaluation, the workflow targets PE[∀ j :Ao,p≥ε|G| 1]: stage(n)→stage(n+1) with LocatorI−DAG: stage(n)→stage(n+1) keeping the error under loss function \(\vartheta :stage\left (n+1\right)\times stage\left (n+1\right)\rightarrow \mathbb {R}\). The predictor error can be obtained as,
$$ {\begin{aligned} {}& H_{stage\left(n\right)}\left[\vartheta\left(P\left[\forall\ j \ : A_{o,p}\geq \varepsilon \left| G\right|{~}_{1}\right]\right.\right. \\& \left.\left.\left(stage\left(n\right)\right),Locator_{I-DAG}\right)\right] \end{aligned}} $$
(15)
This predictor error manages the discrepancies of inter-connection in the I-DAG workflow.
The LocatorI−DAGwith an approximated finite heterogeneous event labels can be sampled with SI−DAG=((stage(n)1,stage(n+1)1),...,(stage(n)n,stage(n + 1)n)) through \(\frac {1}{n}\sum _{i=1}^{n}\vartheta \left (P\left [ \forall \ j \ : A_{o,p}\geq \varepsilon \left | G\right |{~}_{1}\right ], stage\left (n+1\right)\right)\). The workflow loss function are categorized into two types: (i) regression and (ii) classification. The regression loss on predictor LocatorI−DAG is expressed as,
$$ \vartheta\left(a,b\right)=\left(a-b\right)^{2} $$
(16)
and classification loss on predictor LocatorI−DAG is expressed as,
$$ \vartheta\left(a,b\right)=0\ if\ a=b, 1\ otherwise $$
(17)
Thus, I-DAG is ready to facilitate the independent heterogeneous IoT entries with prediction locator.