Privacy-aware and Efficient Student Clustering for Sport Training with Hash in Cloud Environment
Journal of Cloud Computing volume 11, Article number: 52 (2022)
Abstract
With the wide adoption of health and sport concepts in society, effectively analyzing the personalized sports preferences of students based on their past sports training records has become an important and urgent task. However, these records accumulate over time in a central cloud platform, so the data volume is often too large to be processed with a quick response. In addition, the records often contain sensitive information, which may disclose user privacy if the data are not well protected. To address these two challenges, we propose a privacy-aware and efficient student clustering approach, named PESC, which is based on a hash technique and deployed on a central cloud platform connecting multiple local servers. Concretely, each student is first assigned an index, through a uniform hash mapping operation, based on the past sports training records stored in a local server. Then similar students are clustered and registered in the cloud platform based on their respective sport indexes. Finally, we infer the personalized sport preferences of each student from the cluster to which the student belongs. To demonstrate the feasibility of PESC, we provide a case study and a set of experiments deployed on a time-aware dataset.
Introduction
With the spread of COVID-19 all over the world, people are focusing more on health and life than ever before [1,2,3]. As a result, health and sport concepts have gained wider adoption across society [4,5,6]. To achieve the goal of healthy living, sport courses play an increasingly important role in the education system. As a precondition for setting and optimizing sport courses, accurately recognizing the personalized sport preferences of each student is becoming an important and urgent task for education systems. Fortunately, the past sports training records of students provide a feasible basis for clustering the students and then inferring their personalized sports preferences.
However, the past sports training records of students are often accumulated and stored in a central cloud platform for years. As a result, the data volume is often large [7], which may lead to a long response time in student clustering and subsequent sports preference identification. In addition, past sports training records often contain sensitive user information, which may disclose student privacy if the data transmitted to the cloud platform are not well protected [8,9,10]. To address these two challenges, we propose a privacy-aware and efficient student clustering approach, named PESC, for sport preference discovery and mining in a cloud environment.
Concretely, in the proposed PESC approach, each student is first assigned a sport index (carrying less private information) by applying hash projection operations to his or her past sports training records registered in the cloud platform. Second, similar students are clustered into groups based on the less-sensitive sport indexes recorded in the cloud platform; since index-based similarity calculation is quick, a high clustering efficiency can be achieved in the cloud. Third, we determine the personalized sport preferences of each student based on the group or cluster to which the student belongs. Finally, we demonstrate the effectiveness and efficiency of PESC through an intuitive real-world example and a set of simulation experiments.
The major contributions of this article are detailed as below.

(1)
Past sports training records as well as their respective time tags are used as a feasible and promising evaluation basis to infer the personalized sport item preferences of students in cloud-enabled smart education.

(2)
A hash index mechanism is adopted to cluster the students into different groups based on the past sports training records registered in the cloud platform. Concretely, we first convert the sensitive user data into a less-sensitive user index through a hash mapping process. We then use the less-sensitive user indexes to cluster users into different groups without disclosing much user privacy. This way, user privacy is protected while the users are clustered accurately.

(3)
We present a case study drawn from a real-world application scenario to demonstrate the detailed steps of the proposed PESC solution. In addition, a set of simulated experiments is provided to show the feasibility of the proposed PESC solution.
The rest of the paper is organized as follows. The Related literature Section reviews existing work. The A motivating example Section clarifies the research background and significance of this paper with an example. The Our solution: PESC Section introduces the detailed student clustering process as well as the student sport preference recognition process. A case study is presented in the Case study Section and the experiments are reported in the Evaluation Section. We summarize the paper and point out possible research directions in the Conclusions Section.
Related literature
In this section, we investigate the current research associated with sport clustering in cloud computing.
In [11], the authors discuss how the socioeconomic proximity between organizations or individuals affects the development of sports clusters. To this end, they investigate sports clusters in surfing and sailing and put forward a two-step model of cluster development. The purpose of [12] is to explore whether and how sports clusters correlate with community resilience across regions. To answer this question, the authors apply geographically weighted regression and visualization techniques to macro-level data on community resilience. The results show that sports industry clusters are significantly and strongly positively correlated with community resilience. In [13], the Sports eFANgelism scale is used to classify the fans of the Korean league and to analyze the differences between groups. Through cluster analysis, three groups are identified according to the level of fan behavior, and variance analysis shows that the groups differ markedly in four kinds of preaching behaviors. In recent years, holding a large regional event has been considered to have a potentially positive impact on the region’s economy [14]. The authors of [14] delve deeper into this field, focusing on the impact of participatory events on the participants themselves. They use a cluster analysis procedure to segment the participants, thus confirming and discussing the existence of various effects associated with participation in such events.
In [15], the authors survey professional athletes. In this study, professional identity, sports identity and self-efficacy are measured, and cluster analysis is used to analyze the survey results. The study shows that identity and self-efficacy are important factors when athletes choose a dual career path. In [16], the aim is to replicate a cluster-randomized controlled trial which demonstrated that behavioral skill training significantly improved exercise behavior and self-efficacy in adolescents. The results of the replication show heterogeneity in the effectiveness of sports-based interventions, even among apparently similar populations. In [17], the authors note that physical education has become an important course and sport education an important teaching model. The paper studies the influence of sport education on college students’ physical literacy and physical activity level, using a cluster-randomized trial for verification. The results show that, in an environment of limited control and self-development, appropriate sport education has a positive effect on the development of college students’ physical literacy. Nowadays, many young athletes are found to be taking performance-enhancing drugs [18]. To understand the causes of adolescent doping more accurately, the authors of [18] verify the Sports Drug Control model of Adolescent athletes (SDCMAA) and modify the model according to the experimental results. The results show a close relationship between the doping situation of athletes and the sport drug control model, cluster effects and the athletes’ own norm values.
From the above literature review, we conclude that existing studies on population clustering based on sports information often fall short in time efficiency and privacy protection, especially in the big data context. Considering this limitation, we propose in this paper a highly efficient student clustering approach with privacy protection, named PESC.
A motivating example
As shown in Fig. 1, three students (i.e., Lucy, Lily and John) as well as their past sport score records are presented and stored in local servers A, B and C, respectively. For example, Lucy took three kinds of sport items (i.e., volleyball, boating and skating) in the past and her volleyball score in 2020 is 80, boating score in 2021 is 90 and skating score in 2020 is 70. Likewise, the scores of the other two students are also presented in Fig. 1. For uniform data analysis and mining, the scores of students stored in different local servers A, B and C need to be transmitted to a remote cloud platform.
As shown in Fig. 1, different users’ sport-time-score records are stored in different local servers and finally transmitted to a central cloud platform for uniform processing. During such a data transmission process, the user privacy contained in the sport-time-score records may be disclosed to other parties. With the sport-time-score records of students known to the cloud platform, we can divide the students into different groups and infer their respective sport preferences for better sport training. However, two challenges often exist in this student clustering and sport preference recognition process. First, many students (although only three are shown in Fig. 1, the number of students is often large in practice) as well as their respective sport score records are involved in the uniform data analysis in the cloud platform, which consumes much computational time and leads to a long waiting time; moreover, the accumulated sport-time-score records keep growing over time, which calls for ever more processing time. Second, the sport-time-score records often contain private information of the involved students, who are therefore often reluctant to release their sensitive records to the public.
Motivated by the above two challenges, a privacy-aware and efficient student clustering approach, named PESC, is proposed here for uniform data analysis and mining in a cloud environment. The concrete steps of PESC are described in more detail in the following sections.
Our solution: PESC
Next, we briefly introduce the major steps of the proposed PESC solution: first, we convert the sensitive user data into a less-sensitive user index through a hash mapping process; second, we use the less-sensitive user indexes to cluster users into different groups without disclosing much user privacy; third, we determine the personalized sport preferences of each student based on the group or cluster to which the student belongs. An overview of the PESC solution is given in Fig. 2.
Step 1: Sport index assignment
In this step, we create and assign a sport index to each student based on his or her past sport score records registered in the cloud platform. The index is created by a hashing technique [19,20,21] for the following reason: distributed data integration from mobile clients to a cloud platform is often not secure [22,23,24,25,26], while hashing has been proven to be an effective data protection technology. Please note that the sport score records are time-aware, as exemplified in Fig. 1. Therefore, to ease the subsequent sport index creation and assignment procedure, we first model the time-aware sport score records with the sport-time-score matrix M in Eqs. (1)-(2). Here, we assume there are n students, i.e., \(Stu\_set\) = \(\{s_{1}, s_{2},\ldots , s_{n}\}\), m kinds of sport items, i.e., \(SI\_set = \{ si_{1},\ldots , si_{m} \}\) and l time slots, i.e., \(T\_set = \{ t_{1},\ldots , t_{l} \}\). Thus, each column of matrix M in Eq. (1) denotes a student in \(Stu\_set\), each row of \(s_i\) in Eq. (2) indicates a time slot and each column of \(s_i\) in Eq. (2) represents a sport item. With this formulation, each entry \(q_{i,j,k}\) \(( 1 \le i \le n, 1 \le j \le m, 1 \le k \le l )\) in Eq. (2) is student \(s_{i}\)’s score for sport item \(si_j\) at time slot \(t_k\), as exemplified in Fig. 1.
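Eqs. (1)-(2) are not reproduced in this text. Based only on the definitions above (columns of M are students; rows of \(s_i\) are time slots and columns are sport items), they can plausibly be written as follows; the exact layout used by the authors is an assumption.

```latex
% Reconstruction from the surrounding definitions; layout is an assumption.
M = \left( s_{1}, s_{2}, \ldots , s_{n} \right) \tag{1}

s_{i} =
\begin{pmatrix}
q_{i,1,1} & q_{i,2,1} & \cdots & q_{i,m,1} \\
q_{i,1,2} & q_{i,2,2} & \cdots & q_{i,m,2} \\
\vdots    & \vdots    & \ddots & \vdots    \\
q_{i,1,l} & q_{i,2,l} & \cdots & q_{i,m,l}
\end{pmatrix} \tag{2}
```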
Specifically, if student \(s_i\) has not taken sport item \(si_{j}\) at time slot \(t_k\), then the corresponding entry \(q_{i,j,k}\) in Eq. (2) is equal to zero, i.e., \(q_{i,j,k}= 0\) holds. Next, we create and assign a sport index \(Y_i\) for each student \(s_{i} ( 1 \le i \le n)\) in matrix M. Concretely, we convert the matrix \(s_{i}\) in Eq. (2) into a corresponding vector \(v(s_i)\), which is formalized in Eq. (3). As Eq. (3) indicates, \(v(s_{i})\) is an \(l*m\)-dimensional vector. For brevity, we let \(l*m = Q\) in the following discussions. Thus, \(v(s_i)\) is a Q-dimensional vector.
Then we randomly produce a new vector which is also Q-dimensional, denoted by \(\phi = (\omega _{1},\ldots , \omega _{Q})\), following the rule in Eq. (4), where each \(\omega _{\varphi }\) is randomly drawn between -1 and 1. Next, the two Q-dimensional vectors \(v(s_i)\) and \(\phi\) are multiplied in Eq. (5), which yields a real value \(y_{i}\). Furthermore, a sign operation in Eq. (6) converts the real value \(y_{i}\) into a Boolean one. We use binary mapping in Eq. (6) because it is time-efficient and effective, as reported in prior work [27,28,29,30].
We repeat the conversion process described in Eqs. (4)-(6) p times, so that for each student \(s_{i}\) \(( 1 \le i \le n )\) in matrix M we obtain p Boolean values: \(y_{i(1)},\ldots , y_{i(p)}\). Thus, a p-dimensional vector \(Y_{i}\) is obtained, as formalized in Eq. (7). In this paper, \(Y_{i}\) is the sport index of student \(s_{i}\), and we use \(Y_{i}\) to represent \(s_{i}\) in the calculations of Step 2 and Step 3. Since \(Y_{i}\) is an index containing little sensitive information about student \(s_{i}\), the subsequent calculations on \(Y_{i}\) can be considered privacy-free. This way, the students’ sensitive information contained in historical records is well protected.
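Eqs. (3)-(6) are likewise not reproduced in this text. A reconstruction consistent with the description above is sketched below. Two points are assumptions: the row-by-row ordering of entries in \(v(s_i)\), and the handling of the boundary case where the product equals zero. The Boolean values are taken to be 0 and 1, in line with the index (0, 0, 1) quoted in the case study.

```latex
% Reconstruction from the prose; entry ordering in (3) and the tie rule in (6) are assumptions.
v(s_{i}) = \left( q_{i,1,1}, \ldots , q_{i,m,1}, \; \ldots , \; q_{i,1,l}, \ldots , q_{i,m,l} \right) \tag{3}

\phi = \left( \omega_{1}, \ldots , \omega_{Q} \right), \qquad \omega_{\varphi} \in [-1, 1] \tag{4}

y_{i} = v(s_{i}) \cdot \phi = \sum_{\varphi = 1}^{Q} v(s_{i})_{\varphi} \, \omega_{\varphi} \tag{5}

y_{i} :=
\begin{cases}
1, & y_{i} > 0 \\
0, & y_{i} \le 0
\end{cases} \tag{6}
```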
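The index assignment of Step 1 can be illustrated with a minimal Python sketch. It is not the authors’ implementation: the function names, the row-major flattening order, the fixed random seed, the rule that a product of exactly zero maps to 0, and the example scores are all assumptions made for illustration.

```python
import random

def flatten(score_matrix):
    """Flatten an l x m score matrix (rows = time slots, columns = sport items)
    into a Q-dimensional vector, Q = l * m. Row-major order is an assumption."""
    return [score for row in score_matrix for score in row]

def random_projection(q, rng):
    """Draw one Q-dimensional vector with each component uniform in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(q)]

def hash_bit(vector, projection):
    """Project and threshold: 1 if the dot product is positive, else 0
    (mapping a product of exactly zero to 0 is an assumption)."""
    y = sum(v * w for v, w in zip(vector, projection))
    return 1 if y > 0 else 0

def sport_index(score_matrix, projections):
    """Return the p-bit sport index as a tuple, one bit per projection vector."""
    v = flatten(score_matrix)
    return tuple(hash_bit(v, phi) for phi in projections)

# Hypothetical record: 2 time slots x 3 sport items, 0 = item not taken.
lucy = [[80, 0, 70],
        [0, 90, 0]]

rng = random.Random(42)  # fixed seed so the same projections can be reused
projections = [random_projection(6, rng) for _ in range(4)]  # p = 4 hash functions

print(sport_index(lucy, projections))
```

The same list of projection vectors has to be reused for every student within one hash table; otherwise the resulting indexes are not comparable.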
Step 2: Student clustering
Through Step 1, we have derived the student-index correspondence, i.e., each student \(s_{i}\) corresponds to an index \(Y_{i}\). We summarize this correspondence, i.e., \(s_{i} \rightarrow Y_{i}\), in the hash table shown in Table 1. The hash table is recorded in the cloud platform.
To reduce the “false-negative” and “false-positive” probability in the subsequent student clustering results, we create R hash tables instead of one, as presented in Table 2. Thus, each student \(s_i\) corresponds to R indexes: \(Index_{1},\ldots , Index_{R}\). Then, with Table 2, we can cluster the n students in \(Stu\_set\) into different groups \(G = \{g_{1}, g_{2},\ldots \}\) based on the judgment condition in Eqs. (8)-(10). Here, \(s_x\), \(s_z \in Stu\_set\). Please note that the time complexity of the judgment condition in Eqs. (8)-(10) is very low. It is therefore well suited to clustering in the big data context, where data processing often involves high computational costs [31,32,33,34,35] and relies on effective computing offloading capabilities [36,37,38,39,40].
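Eqs. (8)-(10) are not reproduced in this text. The sketch below assumes the reading suggested by the case study, namely that two students are judged similar when their indexes are identical in at least one of the R hash tables. The function names and all index values except the shared index (0, 0, 1) are hypothetical.

```python
from collections import defaultdict

def build_tables(indexes_per_table):
    """indexes_per_table[r] maps student id -> index tuple in hash table r.
    Returns one bucket dict per table: index tuple -> list of student ids."""
    tables = []
    for student_to_index in indexes_per_table:
        buckets = defaultdict(list)
        for student, index in student_to_index.items():
            buckets[index].append(student)
        tables.append(buckets)
    return tables

def similar_students(student, indexes_per_table, tables):
    """Students whose index equals that of `student` in at least one table."""
    found = set()
    for student_to_index, buckets in zip(indexes_per_table, tables):
        found.update(buckets.get(student_to_index[student], []))
    found.discard(student)  # a student is not counted as similar to itself
    return found

# Hypothetical indexes for R = 2 tables and p = 3 hash functions per table.
indexes_per_table = [
    {"s1": (1, 0, 1), "s2": (0, 1, 1), "s3": (1, 1, 0)},
    {"s1": (0, 0, 1), "s2": (1, 0, 0), "s3": (0, 0, 1)},
]
tables = build_tables(indexes_per_table)

print(similar_students("s1", indexes_per_table, tables))
print(similar_students("s2", indexes_per_table, tables))
```

Because each table is a dictionary keyed by the index, finding the candidates for one student costs one lookup per table, independent of the number of students outside the bucket.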
Step 3: Sport preference recognition
For two students \(s_x\) and \(s_z ( \in Stu\_set)\), if they are placed in the same group \(g \in G\), then they probably have similar sport preferences. In this situation, if \(s_z\) likes a sport item \(si_j \in SI\_set\), then \(s_x\) likes the sport item \(si_j\) with high probability, and vice versa. This way, we can infer the sport preferences of each student in universities, since people belonging to the same group often share the same or close preferences [41,42,43,44].
Please note that if no other student belongs to the group g that contains student \(s_x\), then an exception occurs since \(s_x\) has no similar students. In this situation, we loosen the similar student judgment conditions in Eqs. (8)-(10). Concretely, as Eq. (11) shows, if \(s_z\) has the minimal Hamming distance to \(s_x\) in any of the R hash tables, then \(s_z\) is taken as the similar student of \(s_x\); thus, if \(s_z\) likes a sport item \(si_j \in SI\_set\), then \(s_x\) likes the sport item \(si_j\) with high probability, and vice versa.
In our proposal, each user is assigned a less-sensitive index through a hash mapping process, which can be done offline since the indexes can be generated beforehand [45,46,47]. Therefore, this conversion adds almost nothing to the online response time. Afterwards, user clustering is performed through an online user index matching process whose time complexity is approximately O(1). Therefore, the total time cost of our proposal is rather low.
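The relaxed condition can be sketched as a nearest-index search under the Hamming distance. Eq. (11) is not reproduced in this text, so the sketch below is one plausible reading of the prose; the function names, the tie-breaking rule (first student encountered) and the index values are assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length indexes differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_student(student, indexes_per_table):
    """Return (other_student, distance) with the minimal Hamming distance to
    `student` over all hash tables; ties keep the first student encountered.
    Returns (None, None) if there is no other student."""
    best, best_dist = None, None
    for student_to_index in indexes_per_table:
        own = student_to_index[student]
        for other, index in student_to_index.items():
            if other == student:
                continue
            d = hamming(own, index)
            if best_dist is None or d < best_dist:
                best, best_dist = other, d
    return best, best_dist

# Hypothetical indexes (R = 2 tables, p = 3): s2 shares no index with anyone.
indexes_per_table = [
    {"s1": (1, 0, 1), "s2": (0, 1, 0), "s3": (1, 1, 1)},
    {"s1": (0, 0, 1), "s2": (1, 1, 0), "s3": (0, 1, 1)},
]

print(nearest_student("s2", indexes_per_table))
```

Unlike the exact-match lookup, this fallback scans every student in every table, so it is best reserved for the rare students whose group would otherwise be empty.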
Formally, our proposed PESC solution is described with Algorithm 1.
Case study
A case study is constructed to show the concrete steps of the proposed PESC solution when we need to cluster students according to their historical sport records in the cloud platform and infer their respective sport preferences. Here, we assume there are three students, \(s_1\), \(s_2\) and \(s_3\), and each student’s historical sport scores (four sport items in total) over time (two time slots in total) are presented as follows.
Then we convert the three 2*4 matrices corresponding to \(s_1\), \(s_2\), \(s_3\) into three 8-dimensional vectors, i.e., \(v(s_1)\), \(v(s_2)\), \(v(s_3)\). Next, we randomly generate two hash tables, each of which has three vectors \(\phi _1\), \(\phi _2\) and \(\phi _3\) (as shown in Table 3). Then, according to Eqs. (4)-(7), we can derive the indexes of the three students \(s_1\), \(s_2\), \(s_3\), i.e., \(Y_1\), \(Y_2\), \(Y_3\), in the two hash tables, respectively. Since \(Y_1\) = \(Y_3\) holds in the second hash table (i.e., \(Y_1\) = \(Y_3\) = (0, 0, 1)), student \(s_3\) is similar to \(s_1\) with high probability. Therefore, the sport items preferred by \(s_3\) are also preferred by \(s_1\) with high probability. This way, we can achieve privacy-aware and efficient student clustering and sport preference inference for sport training in universities.
Evaluation
Using WS-DREAM, a time-aware quality performance dataset, we design a set of experiments to show the effectiveness and efficiency of the PESC solution. We compare the performance of PESC (i.e., MAE for measuring clustering accuracy and time cost for measuring clustering efficiency) with that of two existing solutions: UCF (user-based collaborative filtering) and ICF (item-based collaborative filtering). Concretely, we measure the performance of the three solutions under different parameter settings. Two parameters are varied: the number of students, from 100 to 300, and the matrix density, from 5% to 40%. The experiments are run on a laptop with a 3.20 GHz CPU and 4.0 GB RAM.
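For reference, MAE (mean absolute error) is the average absolute difference between predicted and observed values; a lower MAE means higher accuracy. The helper below is a generic sketch of the metric only. How the predicted values are derived from the clusters is not specified in this text, and the numbers in the example are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over paired values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical predicted vs. observed scores for three held-out entries.
print(mean_absolute_error([80, 70, 90], [85, 70, 80]))
```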
Concrete experiment results are reported in the following four profiles.

(1)
MAE comparison
We measure the clustering accuracy of the three solutions via MAE [48,49,50,51] under different parameter settings. Concretely, the MAE of the three solutions with respect to the number of students is presented in Fig. 3a. As Fig. 3a indicates, the MAE of the ICF solution is the highest, which means that the clustering accuracy of ICF is poor. In contrast, the PESC and UCF solutions both achieve a low MAE, which indicates that their clustering accuracy is high. Furthermore, the MAE value of PESC is close to that of UCF. This means that PESC achieves a clustering accuracy close to that of the baseline UCF solution, because the hash index technique formalized in Eqs. (4)-(11) has a good similarity-preserving property and therefore returns the most similar students with high probability. Therefore, PESC performs well in terms of clustering accuracy.
Moreover, the MAE of three solutions with respect to matrix density is presented in Fig. 3b. As Fig. 3b shows, the MAE of ICF solution is also the highest, which indicates a poor clustering accuracy. On the contrary, PESC and UCF solutions both perform well in MAE, which indicates that their clustering accuracy is often high. Furthermore, like Fig. 3a, the MAE value of PESC is close to that of UCF, which indicates an approximate clustering accuracy between our PESC and baseline UCF. The reason is the same as that analyzed in Fig. 3a, which will not be repeated here.

(2)
Time cost comparison
Time cost is a key metric of algorithm performance, especially in the big data environment [52,53,54,55,56]. Here, we measure the clustering efficiency of the three solutions via time cost under different parameter settings. Concretely, the clustering efficiency of the three solutions with respect to the number of students is presented in Fig. 4a. As Fig. 4a indicates, the time cost of the ICF solution is the highest, since ICF needs more computation as the number of students increases. The time cost of the UCF solution is smaller than that of ICF but still high, since all the students need to take part in the similarity calculation. Compared to UCF and ICF, our proposed PESC performs the best in terms of time cost, since the hash index technique adopted in PESC is efficient owing to its low matching complexity of O(1). Another observation from Fig. 4a is that the time costs of UCF and ICF both increase with the number of students, while the time cost of PESC stays approximately stable as the number of students grows, which indicates that PESC scales well with big data.
In addition, the clustering efficiency of the three solutions with respect to matrix density is presented in Fig. 4b. Figure 4b shows a result close to that of Fig. 4a: the time cost of PESC is lower than that of UCF and ICF; the time costs of UCF and ICF both increase as the matrix density rises, while the time cost of PESC stays approximately stable. The reason is the same as that analyzed for Fig. 4a and is not repeated here.

(3)
MAE of PESC w.r.t. number of hash functions
In this profile, we measure the clustering accuracy of PESC via MAE under different numbers of hash functions. The measurement results are presented in Fig. 5. As can be seen from Fig. 5, the MAE of PESC generally decreases as the number of hash functions grows. This result can be explained by the inherent property of the hash index technique adopted in PESC. Concretely, when more hash functions are used to generate the hash indexes of students according to Eqs. (4)-(7), the clustering condition becomes stricter and therefore only a smaller number of truly similar students are clustered into the same group. In this situation, the clustering accuracy improves and the MAE decreases accordingly.
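The effect of stricter conditions can be made concrete with a short calculation. The sketch below assumes that, for a given pair of students, each hash bit agrees independently with the same probability (here called `bit_agreement`, a hypothetical input that is higher for more similar students), and that students match in a table only when all p bits agree. Neither the function names nor the example probabilities come from the paper.

```python
def table_match_probability(bit_agreement, p):
    """Probability that all p bits of the index agree in one hash table."""
    return bit_agreement ** p

def any_table_match_probability(bit_agreement, p, r):
    """Probability that the indexes agree in at least one of r hash tables."""
    return 1.0 - (1.0 - bit_agreement ** p) ** r

# Hypothetical per-bit agreement: 0.9 for a similar pair, 0.5 for a dissimilar pair.
for p in (2, 4, 10):
    similar = table_match_probability(0.9, p)
    dissimilar = table_match_probability(0.5, p)
    print(p, round(similar, 4), round(dissimilar, 4))
```

Under these assumptions, raising p lowers the match probability of dissimilar pairs much faster than that of similar pairs, which is consistent with the trend described for Fig. 5; using several tables then recovers some of the similar pairs that a single strict table would miss.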

(4)
Time cost of PESC w.r.t. number of hash functions
In this profile, we measure the clustering efficiency of PESC under different numbers of hash functions. The results are presented in Fig. 6. As Fig. 6 indicates, the time cost of PESC first decreases as the number of hash functions grows and then stays approximately stable. This result can also be explained by the property of the hash index technique adopted in PESC. Concretely, when more hash functions (e.g., from 2 functions to 4 functions) are used to generate the hash indexes of students according to Eqs. (4)-(7), the clustering condition becomes stricter and therefore only a smaller number of truly similar students are clustered into the same group. In this situation, the clustering efficiency improves significantly. Furthermore, when the number of hash functions continues to grow (e.g., from 4 functions to 10 functions), the students belonging to the same clusters stay stable even though the filtering condition becomes stricter. In this situation, the time cost also stays approximately stable.
Conclusions
With the wide adoption of health and sport concepts in society, effectively analyzing the personalized sports preferences of each student has become an important and urgent task in education. In this situation, the past sports training records of students recorded in a central cloud platform provide a feasible basis for clustering the students and then inferring their respective sports preferences. However, these records are often accumulated in the cloud platform for years, so the data volume is often large, which leads to a long processing time for student clustering and sports preference identification. In addition, the records registered in the cloud platform often contain sensitive user information, which may disclose user privacy if the data are not well protected. To address these two challenges, a privacy-aware and efficient student clustering approach, i.e., PESC, is proposed for sport training in cloud-assisted smart education. Concretely, we use hash mapping operations to protect sensitive user information. First, we convert the sensitive user data into a less-sensitive user index through a hash mapping process. Second, we use the less-sensitive user indexes to cluster users into different groups without disclosing much user privacy. This way, user privacy is protected while the users are clustered accurately. We demonstrate the effectiveness of PESC through an intuitive example and a set of experiments.
However, there are other classic privacy protection solutions besides the hashing adopted in this paper, such as encryption, anonymization and differential privacy [57,58,59]. In future work, we will compare PESC with other privacy-preserving techniques experimentally. In addition, in practical sport-related application scenarios, multiple dimensions as well as their weights are meaningful and crucial. Therefore, we will refine PESC by considering the weights of the different sport dimensions involved in student clustering. Finally, for simplicity we consider only one kind of user data (i.e., score) in user clustering and neglect the diversity of user data [60,61,62,63]. In future work, we will take data type diversity into consideration to make our research more robust and applicable.
Availability of data and materials
The WS-DREAM dataset: https://wsdream.github.io/
Abbreviations
PESC: Privacy-aware and efficient student clustering approach
SDCMAA: Sports Drug Control model of Adolescent athletes
UCF: User-based Collaborative Filtering
ICF: Item-based Collaborative Filtering
References
Li K, Zhao J, Hu J et al (2022) Dynamic Energy Efficient Task Offloading and Resource Allocation for NOMA-enabled IoT in Smart Buildings and Environment. Build Environ. https://doi.org/10.1016/j.buildenv.2022.109513
Kumari R, Kumar S, Poonia RC, Singh V, Raja L, Bhatnagar V et al (2021) Analysis and predictions of spread, recovery, and death caused by COVID-19 in India. Big Data Min Analytics 4(2):65–75
Yang Y (2015) Attribute-based data retrieval with semantic keyword search for e-health cloud. J Cloud Comput 4(1):1–6
Kong L, Wang L, Gong W, Yan C, Duan Y, Qi L (2021) LSH-aware multitype health data prediction with privacy preservation in edge environment. World Wide Web. https://doi.org/10.1007/s11280-021-00941-z:1–16
Xu X, Jiang Q, Zhang P, Cao X, Khosravi MR, Alex LT, Qi L, Dou W (2022) Game Theory for Distributed IoV Task Offloading with Fuzzy Neural Network in Edge Computing. IEEE Trans Fuzzy Syst. https://doi.org/10.1109/TFUZZ.2022.3158000:1–1
Xu X, Tian H, Zhang X, Qi L, He Q, Dou W (2022) DisCOV: Distributed COVID-19 Detection on X-Ray Images With Edge-Cloud Collaboration. IEEE Trans Serv Comput 15(3):1206–19
Nitu P, Coelho J, Madiraju P (2021) Improvising personalized travel recommendation system with recency effects. Big Data Min Analytics 4(3):139–154
Cai Z, Xiong Z, Xu H, Wang P, Li W, Pan Y (2021) Generative adversarial networks: A survey toward private and secure applications. ACM Comput Surv (CSUR). 54(6):1–38
Kou H, Liu H, Duan Y, Gong W, Xu Y, Xu X et al (2021) Building trust/distrust relationships on signed social service network through privacy-aware link prediction process. Appl Soft Comput 100:106942
Zheng X, Cai Z (2020) Privacy-Preserved Data Sharing Towards Multiple Parties in Industrial IoTs. IEEE J Sel Areas Commun 38(5):968–979
Gerke A, Dalla Pria Y (2018) Cluster concept: Lessons for the sport sector? Toward a two-step model of sport cluster development based on socioeconomic proximity. J Sport Manag 32(3):211–226
Kim C, Kim J, Jang S (2021) Sport clusters and community resilience in the United States. J Sport Manag 35(6):566–580
Park S, Kim S, Chiu W (2021) Segmenting sport fans by eFANgelism: a cluster analysis of South Korean soccer fans. Manag Sport Leis 1–15
Hautbois C, Djaballah M, Desbordes M (2020) The social impact of participative sporting events: a cluster analysis of marathon participants based on perceived benefits. Sport Soc 23(2):335–353
Cartigny E, Fletcher D, Coupland C, Bandelow S (2021) Typologies of dual career in sport: A cluster analysis of identity and self-efficacy. J Sports Sci 39(5):583–590
Schnider L, Schilling R, Cody R, Kreppke JN, Gerber M (2022) Effects of behavioural skill training on cognitive antecedents and exercise and sport behaviour in high school students: a cluster-randomised controlled trial. Int J Sport Exerc Psychol 20(2):451–473
Choi SM, Sum KWR, Leung FLE, Wallhead T, Morgan K, Milton D et al (2021) Effect of sport education on students’ perceived physical literacy, motivation, and physical activity levels in university required physical education: a cluster-randomized trial. High Educ 81(6):1137–1155
Nicholls AR, Levy AR, Meir R, Sanctuary C, Jones L, Baghurst T et al (2020) The susceptibles, chancers, pragmatists, and fair players: an examination of the sport drug control model for adolescent athletes, cluster effects, and norm values among adolescent athletes. Front Psychol 11:1564
Qi L, Yang Y, Zhou X, Rafique W, Ma J (2021) Fast Anomaly Identification Based on Multi-Aspect Data Streams for Intelligent Intrusion Detection Toward Secure Industry 4.0. IEEE Trans Ind Inf. https://doi.org/10.1109/TII.2021.3139363
Wang F, Li G, Wang Y, Rafique W, Khosravi MR, Liu G et al (2022) Privacy-aware traffic flow prediction based on multiparty sensor data with zero trust in smart city. ACM Trans Internet Technol (TOIT). https://doi.org/10.1145/3511904
Qi L, Hu C, Zhang X, Khosravi MR, Sharma S, Pang S et al (2021) Privacy-aware data fusion and prediction with spatial-temporal context for smart city industrial environment. IEEE Trans Ind Inf 17(6):4159–4167
Huang J, Tong Z, Feng Z () Geographical POI recommendation for Internet of Things: A federated learning approach using matrix factorization. Int J Commun Syst e5161. https://doi.org/10.1002/dac.5161
Yuan L, He Q, Tan S, Li B, Yu J, Chen F et al (2021) Coopedge: A decentralized blockchain-based platform for cooperative edge computing. Proceedings of the Web Conference 2021:2245–2257
Chen Y, Zhao F, Lu Y, Chen X () Dynamic task offloading for mobile edge computing with hybrid energy supply. Tsinghua Sci Technol 10. https://doi.org/10.26599/TST.2021.9010050
Zhou X, Li Y, Liang W (2020) CNN-RNN based intelligent recommendation for online medical prediagnosis support. IEEE/ACM Trans Comput Biol Bioinforma 18(3):912–921
Shao Q, Yu R, Zhao H, Liu C, Zhang M, Song H et al (2021) Toward intelligent financial advisors for identifying potential clients: a multitask perspective. Big Data Min Analytics 5(1):64–78
Catlett C, Beckman P, Ferrier N, Nusbaum H, Papka ME, Berman MG et al (2020) Measuring Cities with Software-Defined Sensors. J Soc Comput 1(1):14–27
Sandhu AK (2022) Big data with cloud computing: Discussions and challenges. Big Data Min Analytics 5(1):32–40
Bouras MA, Farha F, Ning H (2020) Convergence of computing, communication, and caching in Internet of Things. Intell Converged Netw 1(1):18–36
Xu X, Fang Z, Zhang J, He Q, Yu D, Qi L, et al (2021) Edge Content Caching with Deep Spatiotemporal Residual Network for IoV in Smart City. ACM Trans Sen Netw 17(3). https://doi.org/10.1145/3447032
Chen Y, Gu W, Li K () Dynamic task offloading for Internet of Things in mobile edge computing via deep reinforcement learning. Int J Commun Syst e5154. https://doi.org/10.1002/dac.5154
Chen H, Yang C, Zhang X, Liu Z, Sun M, Jin J (2021) From Symbols to Embeddings: A Tale of Two Representations in Computational Social Science. J Soc Comput 2(2):103–156
Qi L, Lin W, Zhang X, Dou W, Xu X, Chen J (2022) A Correlation Graph based Approach for Personalized and Compatible Web APIs Recommendation in Mobile APP Development. IEEE Trans Knowl Data Eng 11
Chen Y, Liu Z, Zhang Y et al (2021) Deep reinforcement learning-based dynamic resource management for mobile edge computing in industrial internet of things. IEEE Trans Ind Inf 17(7):4925–4934
Dai H, Xu Y, Chen G, Dou W, Tian C, Wu X et al (2022) ROSE: Robustly Safe Charging for Wireless Power Transfer. IEEE Trans Mob Comput 21(6):2180–2197
Chen Y, Zhao F, Chen X, Wu Y (2022) Efficient Multi-Vehicle Task Offloading for Mobile Edge Computing in 6G Networks. IEEE Trans Veh Technol 71(5):4584–4595
Zhou X, Liang W, Li W, Yan K, Shimizu S, Wang KIK (2022) Hierarchical Adversarial Attacks Against Graph-Neural-Network-Based IoT Network Intrusion Detection System. IEEE Int Things J 9(12):9310–9319
Zhu K, Zhang T (2021) Deep reinforcement learning based mobile robot navigation: A review. Tsinghua Sci Technol 26(5):674–691
Ying C, Hua X, Zhuo M, et al. (2022) Cost-Efficient Edge Caching for NOMA-enabled IoT Services. China Commun
Zhou J, Li L, Vajdi A, Zhou X, Wu Z (2021) Temperature-Constrained Reliability Optimization of Industrial Cyber-Physical Systems Using Machine Learning and Feedback Control. IEEE Trans Autom Sci Eng 112. https://doi.org/10.1109/TASE.2021.3062408.
Wernke SA (2022) Explosive Expansion, Sociotechnical Diversity, and Fragile Sovereignty in the Domain of the Inka. J Soc Comput 3(1):57–74
Xu Y, Liu Z, Zhang C, Ren J, Zhang Y, Shen X (2022) Blockchain-Based Trustworthy Energy Dispatching Approach for High Renewable Energy Penetrated Power Systems. IEEE Int Things J 9(12):10036–10047
Li T, Li C, Luo J, Song L (2020) Wireless recommendations for Internet of vehicles: Recent advances, challenges, and opportunities. Intell Converged Netw 1(1):1–17
Zhou X, Yang X, Ma J, Wang KIK (2021) Energy Efficient Smart Routing Based on Link Correlation Mining for Wireless Edge Computing in IoT. IEEE Int Things J 11. https://doi.org/10.1109/JIOT.2021.3077937.
Gu R, Zhang K, Xu Z, Che Y, Fan B, Hou H, et al. (2022) Fluid: Dataset Abstraction and Elastic Acceleration for Cloud-native Deep Learning Training Jobs. In: 2022 IEEE 38th International Conference on Data Engineering (ICDE). p. 2182–2195
Cai Z, He Z (2019) Trading Private Range Counting over Big IoT Data. In: 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). p. 144–153
Xu Y, Zhang C, Wang G, Qin Z, Zeng Q (2021) A Blockchain-Enabled Deduplicatable Data Auditing Mechanism for Network Storage Services. IEEE Trans Emerg Top Comput 9(3):1421–1432
Cheung J (2021) Real Estate Politik: Democracy and the Financialization of Social Networks. J Soc Comput 2(4):323–336
Zhang C, Xu Y, Hu Y, Wu J, Ren J, Zhang Y (2021) A Blockchain-Based Multi-Cloud Storage Data Auditing Scheme to Locate Faults. IEEE Trans Cloud Comput 11. https://doi.org/10.1109/TCC.2021.3057771.
Cai P, Zhang Y (2020) Intelligent cognitive spectrum collaboration: Convergence of spectrum sensing, spectrum access, and coding technology. Intell Converged Netw 1(1):79–98
Xu Y, Ren J, Zhang Y, Zhang C, Shen B, Zhang Y (2020) Blockchain Empowered Arbitrable Data Auditing Scheme for Network Storage as a Service. IEEE Trans Serv Comput 13(2):289–300
Huang J, Lv B, Wu Y et al (2022) Dynamic Admission Control and Resource Allocation for Mobile Edge Computing Enabled Small Cell Network. IEEE Trans Veh Technol 71(2):1964–1973
Zhang K, Tian Z, Cai Z, Seo D (2021) Link-privacy preserving graph embedding data publication with adversarial learning. Tsinghua Sci Technol 27(2):244–256
Zeng Q, Zhou Q, He X, Sun Y, Li X, Chen H (2021) Polar Codes: Encoding/Decoding and Rate-Compatible Jointly Design for HARQ System. Intell Converged Netw 2(4):334–346
Gu R, Chen Y, Liu S, Dai H, Chen G, Zhang K et al (2022) Liquid: Intelligent Resource Estimation and Network-Efficient Scheduling for Deep Learning Jobs on Distributed GPU Clusters. IEEE Trans Parallel Distrib Syst 33(11):2808–2820
Xu J, Li D, Gu W et al (2022) UAV-assisted Task Offloading for IoT in Smart Buildings and Environment via Deep Reinforcement Learning. Build Environ. https://doi.org/10.1016/j.buildenv.2022.109218
Zhou J, Zhang M, Sun J, Wang T, Zhou X, Hu S (2022) DRHEFT: Deadline-Constrained Reliability-Aware HEFT Algorithm for Real-Time Heterogeneous MPSoC Systems. IEEE Trans Reliab 71(1):178–189
Dai H, Wang X, Lin X, Gu R, Shi S, Liu Y, et al. (2021) Placing Wireless Chargers with Limited Mobility. IEEE Trans Mob Comput 11. https://doi.org/10.1109/TMC.2021.3136967.
Cai Z, Zheng X (2020) A Private and Efficient Mechanism for Data Uploading in Smart Cyber-Physical Systems. IEEE Trans Netw Sci Eng 7(2):766–775
Zhou X, Liang W, Wang KIK, Huang R, Jin Q (2018) Academic influence aware and multidimensional network analysis for research collaboration navigation based on scholarly big data. IEEE Trans Emerg Top Comput 9(1):246–257
Zhang W, Li Z, Chen X (2021) Quality-aware user recruitment based on federated learning in mobile crowd sensing. Tsinghua Sci Technol 26(6):869–877
Zhou X, Xu X, Liang W, Zeng Z, Yan Z (2021) Deep-Learning-Enhanced Multitarget Detection for End–Edge–Cloud Surveillance in Smart IoT. IEEE Int Things J 8(16):12588–12596
Zhou J, Cao K, Zhou X, Chen M, Wei T, Hu S (2022) Throughput-Conscious Energy Allocation and Reliability-Aware Task Assignment for Renewable Powered In-Situ Server Systems. IEEE Trans Comput Aided Des Integr Circ Syst 41(3):516–529
Acknowledgements
We would like to thank the provider of the WS-DREAM dataset.
Author information
Contributions
Guoyan Diao: Idea and writing. Fang Liu: Formulation and motivation. Zhikai Zuo: Algorithm and experiment design. Mohammad Kazem Moghimi: Literature investigation and English writing. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Diao, G., Liu, F., Zuo, Z. et al. Privacy-aware and Efficient Student Clustering for Sport Training with Hash in Cloud Environment. J Cloud Comp 11, 52 (2022). https://doi.org/10.1186/s13677-022-00325-2
DOI: https://doi.org/10.1186/s13677-022-00325-2
Keywords
 Cloud platform
 User clustering
 Privacy
 Efficiency
 Local server
 Hash