Predicting the individual effects of team competition on college students’ academic performance in mobile edge computing

Mobile edge computing (MEC) has revolutionized teaching in universities. It enables more interactive and immersive classroom experiences, enhancing student engagement and learning outcomes. As an incentive mechanism based on social identity and contest theories, team competition has been adopted in college classrooms and has proven effective in improving students’ participation and motivation. However, despite its potential benefits, many issues remain unresolved: What types of students and teams benefit more from team competition? In what teaching contexts is team competition more effective? Which competition design methods better increase students’ academic performance? Mobile edge computing makes it possible to obtain data on the teaching process and to analyze the causal effect of team competition on students’ academic performance. In this paper, the authors first design a randomized field experiment among freshmen enrolled in college English courses. Then, the authors analyze the observational data collected from the online teaching platform and predict individual treatment effects on academic performance in college English through linear and nonlinear machine learning models. Finally, by carefully investigating features of teams and individual students, the prediction error is reduced by up to 30%. In addition, interpreting the predictive models yields valuable insights into the practice of team competition in college classrooms.


Introduction
In recent years, with the proliferation of mobile devices, MEC has been widely adopted in various industries [1]. In the field of education, MEC has revolutionized teaching in universities. It enables more interactive and immersive experiences in the classroom, enhancing student engagement and learning outcomes. By bringing computational capabilities closer to end users, MEC facilitates the seamless integration of digital resources within the educational domain [2]. Within this context, MEC terminals act as intelligent hubs that capture valuable teaching data in real time. Team competition strategies [3,4], serving as motivational mechanisms, have been widely adopted at various educational levels. As a supplement to traditional classroom instruction, team competition in college English teaching is often used to enhance student engagement, teamwork, and language proficiency. In this teaching method, students are divided into teams and engage in various language-based tasks and challenges. These can include debates, presentations, role-plays, quizzes, and other interactive activities that require the application of English language skills. This strategy is not limited to classroom settings but has been further developed, particularly with the support of MEC. It fosters positive competition among students, enhancing learning efficiency. Additionally, it provides teachers with greater data support, helping them better understand students' learning needs. Hence, it plays a significant role in modern education.
Despite the potential benefits of team competition, plenty of unknowns remain. There is huge heterogeneity among schools, majors, classes and students, which may lead to significant variations in students' motivation and academic performance. What types of students and teams (e.g., by gender, major, grade) benefit more from team competition? Which teaching design methods (e.g., team formation) better increase students' academic performance? In what teaching contexts is team competition more effective? Is there a causal relationship between in-class activities (e.g., discussion, quiz, homework) and academic performance? Understanding the causal effects between these factors and students' academic performance can help teachers optimize the practice of team competition in college classrooms for different types of students, thereby improving students' motivation and academic performance.
However, answering these questions is challenging. First, few real-world datasets cover the whole team competition learning process. Controlled field experiments are necessary to collect enough data for this research. Second, measuring the causal effects between the team competition mechanism and students' academic performance is intrinsically difficult [5]. It requires a proper definition of individual performance measures and prediction targets [6]. Third, the variable space describing the characteristics of the context, students, teams and teaching activities is high-dimensional [7][8][9]. Moreover, there are many complex relationships among these variables. Both domain knowledge and data analytics are needed to identify the potential predictive factors [10][11][12].
In this paper, a novel approach is proposed to address these challenges, as shown in Fig. 1. A randomized field experiment among freshmen enrolled in a college English course is first conducted; then the individual treatment effect of team competition on students' academic performance is predicted. Moreover, by interpreting the predictive models, the authors investigate the most significant factors in the practice of team competition in college classrooms. Since students' performance in teaching activities (i.e., answer race, discussion, quiz and homework) is spread over the long span of a semester, the data from their homework results are consolidated into an online teaching platform, which serves as a centralized cloud platform. This enables more comprehensive analysis and mining of the data.
Concretely, the contributions include:
(1) The authors employ MEC terminals to capture valuable teaching data in real time, and then design and run a controlled field experiment to collect comprehensive data throughout the entire process of team competition learning.
(2) Leveraging the capabilities of the MEC infrastructure, the authors frame the problem as a prediction task and employ machine learning models to forecast the individual treatment effect of team competition on students.
(3) The prediction model is interpreted to identify the most important factors in team competition learning.

Fig. 1 An overview of the proposed approach

MEC in education
MEC enables real-time data analysis and processing [13], making it possible to gather and analyze learner data promptly. This information can be utilized to personalize the learning experience, adapting content and recommendations based on individual needs and preferences [14][15][16]. This can be beneficial for real-time collaboration tools [17], video streaming, and online interactive learning platforms, providing a seamless and immersive learning experience. MEC has the potential to revolutionize education by improving access, personalizing learning experiences, and enabling innovative technologies [18]. By harnessing the power of edge computing, educational institutions can enhance their digital infrastructure and provide more efficient and effective learning environments.

Team competition
As an incentive mechanism based on social identity and contest theories, team competition has been increasingly applied in many fields. It has been shown that team competition can not only effectively improve key metrics such as participation [19], but also help participants obtain a sense of achievement [20]. Markus et al. [21] investigate how to leverage team competition to improve cost efficiency in crowdsourcing through a large-scale experimental evaluation. Ai et al. [22] conduct an inter-team contest field experiment on a ride-sharing platform and find that drivers who participated in the team competition work longer hours and earn higher revenue than drivers in control conditions. Ye et al. [23] study how different factors of team competition affect the outcomes of individual drivers in ride-sharing, based on the results of online field experiments. With regard to education, the imperative to maintain competitiveness and facilitate the transformation of database management practices has necessitated alignment with the prevailing, cutting-edge technological trends within the industry [24]. DiNapoli [20] describes the implementation of a pedagogy based on team competition in mathematics classrooms and shows that team competition could be a useful motivator. Scales et al. [25] conclude, through a randomized controlled field experiment, that team-based game mechanics can increase resident participation in an online learning platform delivering quality improvement content. To enhance the effectiveness and quality of experimental teaching, a comprehensive experimental teaching course system that combines artificial intelligence and edge computing technologies has been built [26]. By deploying edge computing nodes in laboratories or educational settings, experimental data can be transmitted in real time to edge devices for processing and analysis. For example, students' health and physique data are integrated into a central cloud platform for more comprehensive data analysis and mining [27]. However, to the best of our knowledge, few studies have analyzed the importance of different characteristics in team competition, particularly in college English teaching.

Individual treatment effect prediction
Predicting individual treatment effects of actions plays a critical role in many domains [28][29][30][31]. The Synthetic Minority Oversampling Technique (SMOTE) is used to preprocess missing values in the input dataset to enhance prediction accuracy [31]. A Metaheuristic Optimization-based Feature Subset Selection with an Optimal Deep Learning model (MOFSS-ODL) for predicting students' performance is presented in [32]. Many researchers propose algorithms for predicting the individual treatment effect (ITE) based on different techniques, e.g., deep neural networks [33] and random forests [34]. Others study the application of ITE prediction in different fields, e.g., medicine [34,35] and online platforms [36]. This work is similar to recent work that predicts ITE in the ride-sharing economy [23]. However, this work focuses on the ITE prediction of students' academic performance. Moreover, different machine learning models are adopted to better capture the characteristics of college English teaching.

Experiment setup
To test the impact of team competition on the academic performance of college students, a randomized field experiment among freshmen enrolled in a college English course is conducted. The authors choose the college English course for the classroom experiment for two reasons. First, as a commonly required course, college English has a large enrollment in Chinese universities. The assessment of this course is highly standardized. All students use identical course materials, with instructional activities and examinations administered through a unified online platform hosted on the MEC terminal. Therefore, this course structure allows control and treatment groups to be split uniformly among classrooms. Second, the direct link between students' academic performance and scholarships, graduation and post-graduation employment motivates students to do well in the college English course.
The sample is made up of freshmen enrolled in the college English course taught by the author during the fall semester of the 2021-2022 academic year. Students with incomplete information are excluded, resulting in a final sample of four classes and 180 students. Table 1 shows the descriptive statistics for students in different groups. The first row shows the number of observations in each group. The second row shows the ratio of female students in each group. The ratio of students from Shandong province, where the university is located, is shown in the third row.

Team formation
Classrooms were randomized into either a control group or one of three treatment groups, as shown in Fig. 2. In the first treatment group, students are permitted to form teams freely. In the second treatment group, students are randomly assigned to teams. This group is intended to replicate the most common scenario of team formation in teaching practice. In the third treatment group, students are split into teams according to their academic performance, i.e., their English score in the National College Entrance Examination (NCEE). The control group uses traditional teaching methods, with no team competition mechanism introduced in the teaching process. All teams are of similar size, with 6 to 7 regular members.
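The random and grade-balanced assignment strategies can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function names, the default team size of 6, and the use of a snake draft over NCEE English scores to balance teams are all assumptions made for the example.

```python
import random


def random_teams(student_ids, team_size=6, seed=0):
    """Shuffle students and deal them into teams (hypothetical helper)."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    n_teams = max(1, len(ids) // team_size)
    # Deal students round-robin so team sizes differ by at most one.
    return [ids[k::n_teams] for k in range(n_teams)]


def grade_balanced_teams(scores, team_size=6):
    """Snake-draft students by score so team averages stay close.

    `scores` maps student id -> NCEE English score. The snake draft is an
    assumption; the text does not state the exact balancing rule.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_teams = max(1, len(ranked) // team_size)
    teams = [[] for _ in range(n_teams)]
    for pos, sid in enumerate(ranked):
        rnd, idx = divmod(pos, n_teams)
        # Reverse the pick order on every other round (1..n, then n..1).
        team = idx if rnd % 2 == 0 else n_teams - 1 - idx
        teams[team].append(sid)
    return teams
```

With a class of 45 students and the default team size, both helpers produce seven teams of 6 or 7 members, which matches the team sizes reported above.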

Contest design
During the contest period, all teams in the three treatment groups engage in team competitions against other teams in the same class. Scores are awarded to these teams according to their ranks in the class, and these scores contribute to the final score of the course. Besides the final exam, the final score of a student also includes performance in teaching activities, i.e., answer race, discussion, quiz and homework. All the activities are conducted on an online teaching platform, and the performance of students is collected automatically. The score of a team is computed by averaging the final scores of all team members. The scores of each

Problem formulation
ITE indicates the effect of team competition on the academic performance of a student. The difference-in-differences (DID) approach [37] is employed to estimate the ITE. The DID approach first calculates the difference in academic performance before and after team competition for each student, then averages the performance change in the control group, and finally computes the difference between the two. Formally, consider a student set $S = S_{t1} \cup S_{t2} \cup S_{t3} \cup S_c$, where $S_{t1}$, $S_{t2}$, $S_{t3}$ and $S_c$ indicate students in treatment group 1, treatment group 2, treatment group 3 and the control group, respectively. Let $S_{i,T}$ be the academic performance of student $i$ in time period $T$, $T_0$ be the baseline period before the competition starts, and $T_1$ be the time period when the competition ends. The difference in the academic performance of student $i$ before and after the competition period can be calculated by

$$\Delta S_i = S_{i,T_1} - S_{i,T_0} \qquad (1)$$

The average performance difference of students in the control group can be calculated by

$$\overline{\Delta S_c} = \frac{1}{|S_c|} \sum_{k \in S_c} \Delta S_k \qquad (2)$$

Finally, the individual treatment effect of student $i$ can be obtained by

$$\mathrm{ITE}_i = \Delta S_i - \overline{\Delta S_c} \qquad (3)$$

Given a student $i$ in team $j$, let $F^S_i$ denote the feature list of the student and $F^T_j$ represent the features of team $j$. The problem of predicting the ITE of student $i$ can be formulated by

$$\widehat{\mathrm{ITE}}_i = f\left(F^S_i, F^T_j\right) \qquad (4)$$

where $f$ is the prediction model to be learned.
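The difference-in-differences computation described above can be sketched in a few lines. This is an illustrative sketch only: the function name and the dictionary-based data layout are assumptions, and the scores before and after the competition are assumed to be on a comparable scale.

```python
from statistics import mean


def individual_treatment_effects(pre, post, control_ids):
    """Difference-in-differences ITE per student.

    pre, post: dicts mapping student id -> score at T0 and at T1.
    control_ids: ids of the students in the control group.
    Names and data layout are illustrative assumptions.
    """
    # Per-student change between the baseline and the end of the competition.
    delta = {sid: post[sid] - pre[sid] for sid in pre}
    # Average change among control students (the trend without competition).
    control_trend = mean(delta[sid] for sid in control_ids)
    # ITE: each student's change relative to the control trend.
    return {sid: d - control_trend for sid, d in delta.items()}


# Toy numbers, made up for illustration (not data from the experiment).
pre = {"a": 60, "b": 70, "c": 65, "d": 75}
post = {"a": 70, "b": 72, "c": 67, "d": 79}
ites = individual_treatment_effects(pre, post, control_ids=["c", "d"])
# The control students improve by 3 points on average, so student "a",
# who improves by 10 points, has an ITE of 7.
```

By construction, the ITEs of the control students average to zero, so any non-zero group average in a treatment group reflects a change beyond the control trend.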

Feature selection
Based on theoretical insights from social identity theory and contest theory [37,38], as well as domain knowledge from college English teaching, the features of a student in this experiment are characterized from two aspects: team features and individual student features.

Team features
According to social identity theory, an individual's social identity is shaped by their membership [39] in specific groups and the emotional significance [40] they attach to those groups. Team features depict the team-level characteristics that are related to the behavior of students, such as team formation strategy, team diversity and average team performance. In detail, team diversity is indicated by gender diversity and hometown diversity, which are measured by the ratio of female students and the ratio of students from within the province, respectively. To depict the performance of a team, the Aptis grades of all teammates are averaged. The performance of a team is a potentially significant predictor of ITE.
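The team-level features described above can be derived from member records as in the sketch below. The record keys (`gender`, `in_province`, `aptis`) and the function name are hypothetical and chosen only for this example.

```python
from statistics import mean


def team_features(members):
    """Aggregate team-level features from member records.

    Each member is a dict with hypothetical keys:
    'gender' ('F' or 'M'), 'in_province' (bool), 'aptis' (overall score).
    """
    n = len(members)
    return {
        # Gender diversity: share of female students in the team.
        "female_ratio": sum(m["gender"] == "F" for m in members) / n,
        # Hometown diversity: share of students from within the province.
        "in_province_ratio": sum(m["in_province"] for m in members) / n,
        # Team performance: mean Aptis score of all teammates.
        "avg_aptis": mean(m["aptis"] for m in members),
    }
```

Every member of a team would receive the same three values, joined to that student's individual features before model training.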

Individual student features
In contest theory, when studying the behavior of participants in team competition in college teaching, researchers often consider various individual features or factors of students that can influence their performance. Individual student features are made up of the demographics, academic performance before the competition [41][42][43], and classroom behaviors [44] of a student. To depict a student's academic performance before the competition, the student's performance in the National College Entrance Examination (NCEE) and the Aptis test is investigated. In detail, NCEE performance is indicated by the overall mark and subject marks. Aptis performance is indicated by the overall score, the scores for listening, speaking, reading and writing, and a score for the grammar and vocabulary component. The authors then capture students' classroom behaviors from three aspects: the number of times a student participates in answer races, quiz scores and homework scores. Moreover, student demographics, e.g., gender, hometown and age, are also included in the feature set.
In this study, a student's ITE is calculated from the student's Aptis score and final exam score. Aptis is an assessment tool that is widely adopted in China. It accurately tests English language ability in all four skills: reading, listening, writing and speaking. It is held every October in our school to assess the English language level of our students. All freshmen are asked to take the exam, which provides us with a full and accurate evaluation of students' English ability before the competition. The distributions of students' Aptis scores in each group are approximately normal, as shown in Fig. 3.
The final exam is conducted at the end of the semester and includes a written test and an oral test. All groups use the same test paper, and all papers are marked by the same teacher. Because the result of the oral test may be subjective, only the written test score is used to calculate the ITE of a student. The distribution of final exam scores of all participants in each group is shown in Fig. 4.

Model implementations
A number of machine learning models can be employed for ITE prediction. Because this study focuses on understanding the potential predictors of ITE, only models whose feature importance can be readily interpreted are considered. Here the authors choose four commonly used machine learning methods: extreme gradient boosting (XGBoost) [39,45], light gradient boosting machine (LGBM) [46], Lasso and Ridge.

XGBoost
The XGBoost model is used with 100 trees and randomly samples 90 percent of the training data prior to growing trees. The authors choose the dart booster as the XGBoost booster, which can prevent overfitting and improve model performance. The Python package of the dmlc XGBoost implementation is used with the above-mentioned parameters to train the model.

LGBM
An LGBM model is also used for comparison with the other models. The parameters of the LightGBM model, such as the booster and the subsample ratio, are similar to those of the XGBoost model. However, 2000 trees are used to construct the LGBM model, with a learning rate of 0.01. For the other parameters, the GridSearchCV algorithm provided by scikit-learn is used to search for the best values. The LGBM Python package is used to build the model.
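The stated settings of the two boosted-tree models can be collected as keyword arguments, as in the configuration sketch below. The parameter names follow the scikit-learn-style wrappers of the two libraries, which is an assumption because the text does not say which interface was used. Reading "similar" as the same dart booster and 0.9 subsample ratio for LGBM is an interpretation, and `subsample_freq` is an added assumption.

```python
# Settings stated for XGBoost: 100 trees, 90% row subsampling, dart booster.
xgb_params = {
    "n_estimators": 100,
    "subsample": 0.9,
    "booster": "dart",
}

# Settings stated for LGBM: 2000 trees and a learning rate of 0.01.
# The booster and subsample values mirror XGBoost (an interpretation).
lgbm_params = {
    "n_estimators": 2000,
    "learning_rate": 0.01,
    "boosting_type": "dart",
    "subsample": 0.9,
    # Assumption: LightGBM applies row subsampling only when the bagging
    # frequency is non-zero, so it is enabled here.
    "subsample_freq": 1,
}
```

Under these assumptions, the dictionaries would be passed as `xgboost.XGBRegressor(**xgb_params)` and `lightgbm.LGBMRegressor(**lgbm_params)`, with the remaining LGBM parameters left to the grid search.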

Lasso and ridge
Both Lasso and Ridge are linear models. They are commonly used for regularized regression, and Lasso in particular for feature selection. Lasso applies an L1 penalty to the coefficients, while Ridge applies an L2 penalty. Both models have a coefficient for every feature, which directly shows the correlation between the feature and the target. However, because of the difference in penalty, Lasso forces certain coefficients to exactly zero, whereas Ridge only shrinks coefficients without setting them to zero. The scikit-learn package is also used for these models. In addition, because the data have already been processed with min-max normalization, they are not normalized again, and the "cv" parameter is set to 5.
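For reference, the two penalties can be written out explicitly. The notation here is introduced only for illustration: $X$ is the feature matrix, $y$ the vector of ITEs, $w$ the coefficient vector, $\alpha$ the regularization strength and $n$ the number of training samples. The forms below follow the objectives documented for the scikit-learn Lasso and Ridge estimators.

```latex
\hat{w}_{\mathrm{Lasso}} = \arg\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1

\hat{w}_{\mathrm{Ridge}} = \arg\min_{w} \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
```

The L1 term is not differentiable at zero, which is what allows Lasso to set coefficients exactly to zero, while the squared L2 term only shrinks coefficients smoothly toward zero.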

Evaluation
In this section, the effect of team competition on college students' academic performance is analyzed by answering the following research questions:
RQ1: How do different machine learning models perform in ITE prediction?
RQ2: Which features are most correlated with students' academic performance when conducting team competition in college classrooms?
RQ3: How do different competition design methods impact the effect of team competition on students' academic performance?

Performance comparison
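Following standard practice, the dataset is shuffled and split into a training set, a validation set and a test set. The authors adopt the root mean squared error (RMSE), which is commonly used to measure the accuracy of a machine learning predictor [41][42][43]:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{ITE}_i - \widehat{\mathrm{ITE}}_i \right)^2} \qquad (5)$$

where $N$ indicates the sample size, $\mathrm{ITE}_i$ is the observed ITE of student $i$, and $\widehat{\mathrm{ITE}}_i$ is the predicted value.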
The prediction accuracy of the models on both the validation set and the test set is illustrated in Fig. 5. To test the validity of this study, two baselines are constructed. The random baseline draws a random value from a Gaussian distribution estimated from the ITEs in the training set. The average baseline predicts every ITE in the test set as the mean of all ITEs in the training set. Figure 5 shows that XGBoost, LGBM, Ridge and Lasso all achieve similar accuracy, demonstrating a significant advantage over the average and random baselines in RMSE by up to 95% and 30%, respectively.
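The RMSE metric and the two baselines can be sketched with the standard library alone. The function names and the fixed random seed are assumptions for this illustration, and using the sample standard deviation for the Gaussian fit is also an assumption, since the text does not specify the estimator.

```python
import math
import random
from statistics import mean, stdev


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def average_baseline(train_ites, n_test):
    """Predict the training-set mean ITE for every test student."""
    return [mean(train_ites)] * n_test


def random_baseline(train_ites, n_test, seed=0):
    """Draw predictions from a Gaussian fitted to the training ITEs."""
    rng = random.Random(seed)
    mu, sigma = mean(train_ites), stdev(train_ites)
    return [rng.gauss(mu, sigma) for _ in range(n_test)]
```

A model's relative improvement over a baseline can then be expressed as `1 - rmse(y, model_pred) / rmse(y, baseline_pred)`.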

Analysis of feature importance
XGBoost, LGBM and Lasso can select features during training. Eliminating features with zero coefficients in Lasso, as well as those with negative importance in XGBoost and LGBM, allows us to identify the most significant predictors for all three models. Note that because of differences in structure, different models may choose different features, as shown in Fig. 6.
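For the Lasso case, the elimination step amounts to dropping zero coefficients and ranking the rest by magnitude, as in the sketch below. The helper name, the tolerance and the coefficient values are hypothetical; the numbers are toy values, not results from the experiment.

```python
def select_features(coefficients, tol=1e-12):
    """Keep features with non-zero coefficients, sorted by magnitude."""
    kept = {name: c for name, c in coefficients.items() if abs(c) > tol}
    return sorted(kept.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy coefficients, made up for illustration only.
toy_coefs = {"team_avg_aptis": -0.8, "age": 0.0, "quiz_score": 0.3}
ranked = select_features(toy_coefs)
# ranked == [("team_avg_aptis", -0.8), ("quiz_score", 0.3)]
```

Sorting by absolute value keeps strongly negative predictors at the top of the ranking instead of placing them last.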
The importance of features is investigated across the different ITE prediction models. Figure 6a and b show the features selected by the Lasso and Ridge models. Surprisingly, the average Aptis score of a team is the largest negative factor in both the Lasso and Ridge models. This finding is consistent with the relationship between ITE and average team performance, as shown in Fig. 7.
Teams with the highest Aptis scores yield smaller treatment effects than teams with low Aptis scores. Moreover, the Aptis scores for writing, speaking, listening and reading are also negative factors in the Lasso and Ridge models, and they are important features in XGBoost. Their relationship with ITE is consistent with the relationship between ITE and the average Aptis score of a team, which suggests that students with low academic performance may benefit more from the application of team competition in college English teaching.

Impact of competition design
The method of team formation is a significant predictor in ITE prediction. Figure 8 illustrates the ITEs of the three treatment groups, which form teams by different methods, and the ITE of the control group, which does not conduct team competition. As shown in Fig. 8, the self-formed treatment group obtains the largest treatment effect. This result is consistent with conclusions drawn in other domains [23]. A likely reason is that students in self-formed teams are usually acquaintances in real life, which may lead to a higher level of team identity and responsibility. The grade-balanced treatment group yields a smaller treatment effect than the self-formed treatment group, but its treatment effect is larger than those of the other two groups. This finding provides insights for team formation in scenarios where students are not familiar with each other. Not surprisingly, the treatment effect of the control group is approximately 0. A rather intriguing finding is that the randomly assigned treatment group obtains the smallest treatment effect, which is in fact negative.
In addition, the authors investigate the average number of discussions in each group, as shown in Fig. 8b. The self-formed treatment group engages in the most discussions, and the randomly assigned treatment group participates in the fewest. The number of discussions in which the grade-balanced group participates is larger than that of the control group but smaller than that of the self-formed treatment group. This is consistent with the average ITE of the four groups. The result shows that the self-formed group is more proactive than the other groups and obtains the largest individual treatment effect. Moreover, it can also be concluded that introducing team competition into college English teaching does not necessarily have a positive effect on students' academic performance; the effect depends on how the team competition is conducted.

Conclusion
In conclusion, this research delved into two crucial realms: the impact of team competition on college students' academic performance and the integration of machine learning techniques with MEC terminal data. Through a rigorous randomized field experiment among college freshmen, team-related and individual features are meticulously analyzed using advanced machine learning models. The findings underscore the significant predictive power of these features on academic performance, enabling a reduction in prediction error of up to 30%.
Moreover, this study provided valuable insights into the practical application of team competition strategies within college classrooms, offering immediate implications for the teaching design of college English. Team competitions can facilitate mutual learning among students, thus improving their grasp of English language concepts, particularly for those who struggle academically. College administrators are responsible for creating an environment that fosters healthy competition among English teaching teams. This includes providing necessary resources, such as training programs, teaching materials, and MEC technology support.
While this research represents a foundational step, further exploration is essential. Future endeavors will encompass additional field experiments, extending these insights to various courses, and addressing unresolved issues at the intersection of machine learning and MEC data processing. This interdisciplinary approach paves the way for enhancing educational methodologies, fostering active student engagement, and advancing the integration of cutting-edge technologies in contemporary learning environments.

Fig. 3 Distributions of Aptis overall marks of all participants and three treatment groups

Fig. 4

Figure 6c and d illustrate the top 15 most important features selected from the XGBoost and LGBM models. The academic performance features of teams and individuals before the competition, e.g., the average Aptis score of a team, the overall Aptis score and the Aptis scores of the four skills, and the overall score and the scores of all subjects in the NCEE, show strong predictive power in ITE prediction.

Fig. 6 Importance scores of features

Fig. 7 Relationship between average performance of a team and ITE

Table 1
Descriptive statistics of the sample

Control Treatment 1 Treatment 2 Treatment 3
Experimental design

team members and other teams are presented on a score board for students to check during the contest period. At the end of the semester, the top 5 teams on the score board in each treatment group are rewarded with 5 to 10 extra points toward their final score.