The XGBoost paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning. XGBoost is a powerful machine learning library that is great for solving classification, regression, and ranking problems. The package includes an efficient linear model solver and tree learning algorithms, and it supports various objective functions, including regression, classification, and ranking; booster parameters depend on which booster you have chosen. It is written in C++ and NVIDIA CUDA® with wrappers for Python, R, Java, Julia, and several other popular languages, and it runs on a single machine as well as on Hadoop, Spark, Dask, Flink, and DataFlow (dmlc/xgboost). Gradient boosting is also a popular technique for efficient modeling of tabular datasets, and CatBoost and LightGBM ship ranking learners as well.

Ranking the instances within a group should be parallelized as much as possible for better performance. That has two prerequisites: you need a way to sort all the instances using all the GPU threads while keeping group boundaries in mind, and you need to find, in constant time, where a training instance originally at position x in an unsorted list would have been relocated to, had it been sorted by a different criterion. If you then have to find the rank, when sorted by predictions, of an instance pair chosen by the pairwise approach, you look up the original positions of the chosen instances (as sorted by labels) in the indexable prediction array described later. However, this requires compound predicates that know how to extract and compare labels for a given positional index, and the performance was largely dependent on how big each group was and how many groups the dataset had.

At Spark+AI Summit 2019, we shared GPU acceleration of Spark XGBoost for classification and regression model training on a Spark 2.x cluster. The benchmark dataset uses the default training configuration on the GPU and consists of ~11.3 million training instances. With standard feature normalization, values corresponding to the mean have a value of 0, and values one standard deviation above or below the mean have values of 1 and -1, respectively.

Thanks to the widespread adoption of machine learning, it is now easier than ever to build and deploy models that automatically learn what your users like and rank your product catalog accordingly; the model thus built is then used for prediction in a future inference phase, and models are deployed in application packages. The rank:map objective uses LambdaMART to perform list-wise ranking in which Mean Average Precision (MAP) is maximized. The XGBoost Python API also comes with a simple wrapper around its ranking functionality called XGBRanker, which uses a pairwise ranking objective (PyPI package: XGBoost-Ranking; related xgboost issue: Add Python Interface: XGBRanker and XGBFeature #2859).
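As a minimal sketch of how that wrapper is typically driven (the synthetic arrays and parameter values are illustrative only; depending on the XGBoost version, fit accepts group sizes via group= or per-row query ids via qid=):

```python
import numpy as np
import xgboost as xgb

# Two queries ("groups") with 4 and 3 candidate documents, respectively.
X = np.random.rand(7, 5)                 # 7 documents, 5 features
y = np.array([3, 2, 1, 0, 2, 1, 0])      # graded relevance labels
group = [4, 3]                           # documents per query, in row order

ranker = xgb.XGBRanker(objective="rank:pairwise", n_estimators=50)
ranker.fit(X, y, group=group)

# Scores are only comparable within a query; rank each query's documents by score.
scores = ranker.predict(X)
```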
The xgboost Python package allows for efficient single-machine training. XGBoost is an implementation of gradient boosted decision trees designed for speed and performance; Extreme Gradient Boosting (XGBoost) is one of the most popular variants of gradient boosting and an optimized implementation of the gradient boosting algorithm. A benefit of using gradient boosting is that, after the boosted trees are constructed, it is relatively straightforward to retrieve importance scores for each attribute. Generally, importance provides a score that indicates how useful or valuable each feature was in the construction of the boosted decision trees within the model: the more an attribute is used to make key decisions in the trees, the higher its relative importance. Note that XGBoost only accepts numerical inputs, so it is up to you to ensure that the array you pass to the model is numerical and in the best cleansed state possible.

Oracle Machine Learning supports pairwise and listwise ranking methods through XGBoost. For such a training data set, the data come in a number of sets (groups); each set consists of objects and labels representing their ranking. In the pairwise approach, the goal is learning the relative order ("relations") between items within a list, that is, which item beats which. Using XGBoost on Amazon SageMaker provides additional benefits like distributed training and managed model hosting without having to set up and manage any infrastructure. The Learning to Rank plugin uses models from the XGBoost and Ranklib libraries to rescore search results, and a model can also be evaluated directly in a search engine's rank profile (for example, rank-profile evaluation inherits training { first-phase { expression: xgboost… ).

The LETOR model's performance is assessed using several metrics, such as NDCG and MAP (described later); the computation of these metrics after each training round still uses the CPU cores. The official ranking example is not clear enough, though, and many people leave questions on StackOverflow about how to rank and how to get the leaf index as a feature.

A naive approach to sorting the labels (and predictions) for ranking is to sort the different groups concurrently in each CUDA kernel thread. However, the CUDA kernel threads have a maximum heap size limit of 8 MB; the limit can be increased, but after it is increased it applies globally to all threads, resulting in wasted device memory, and the approach still suffers the same penalty as the CPU implementation, albeit slightly better. Consequently, the following approach results in much better performance, as evidenced by the benchmark numbers discussed later. Even with a couple of radix sorts (relying on the weak-ordering semantics of the label items) that use all the GPU cores, this performs better than a compound predicate-based merge sort of positions containing labels, with the predicate comparing the labels to determine the order. After the labels are sorted, each GPU thread works concurrently on a distinct training instance, figures out the group it belongs to, and runs the pairwise algorithm by randomly choosing a label to the left or right of its own label group. With this, the previous results are rectified and performance is enhanced.
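Once the labels are sorted this way, finding how a chosen instance would rank by prediction uses the "indexable prediction array" idea described above. Here is a small NumPy illustration of the bookkeeping (the real implementation lives in C++/CUDA on the GPU; the arrays are made up):

```python
import numpy as np

# One query group: graded labels and the current model predictions.
labels = np.array([1, 1, 0, 2, 0])
preds  = np.array([0.3, 0.9, 0.4, 0.1, 0.8])

# Original positions, sorted by label (descending).
pos_by_label = np.argsort(-labels, kind="stable")

# Rank of every original position when sorted by prediction (descending):
# this is the "indexable prediction array".
order_by_pred = np.argsort(-preds, kind="stable")
rank_by_pred = np.empty_like(order_by_pred)
rank_by_pred[order_by_pred] = np.arange(len(preds))

# For the instance at label-sorted position 0 (the most relevant one),
# look up where it currently sits when sorted by prediction.
original_pos = pos_by_label[0]
print(rank_by_pred[original_pos])   # 4: the most relevant document is currently ranked last
```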
How do you use XGBoost for ranking? First, let's briefly review the ranking task. A good way to get started with learning to rank is to note that there are three broad families of approaches to the ranking problem: point-wise, where learning a relevance score for each individual item in the list (for a specific user or query) is the target; pair-wise, where learning the relative order between pairs of items within a list is the goal; and list-wise, which optimizes a measure over the whole list. The pros and cons of the different ranking approaches are described in LETOR in IR. This can be accomplished much as recommender systems do it, and the learner has access to two sets of features to learn from, rather than just one; by doing this, we were solving a ranking problem.

As we know, XGBoost offers interfaces to support ranking and to obtain the tree-node (leaf) index as a feature. XGBoost is a widely used, supervised, decision-tree-based ensemble machine learning library; it uses gradient boosting techniques to incrementally build a better model during the training phase by combining multiple weak models. In fact, since its inception in early 2014, it has become the "true love" of Kaggle users for dealing with structured data, and it is arguably the most popular machine learning algorithm these days; regardless of the task (regression or classification), it is well known to provide better solutions than many other ML algorithms. Gradient boosted trees have been around for a while and there are a lot of materials on the topic, so the algorithm itself is outside the scope of this post. All the models we build are trained on an existing dataset, such as the Microsoft Learning to Rank (MS LTR) benchmark, and in this blog post I'll share how to build such models. To learn about the different hyperparameters of XGBoost and the role they play in training, see the Guide to Hyperparameter Tuning for XGBoost in Python.

One reader question on feature selection: "I run xgboost 100 times and select features based on the rank of mean variable importance across the 100 runs. If I then pick the top 10 factors and run xgboost again with the same hyperparameters on just those 10 features, the previously most important feature surprisingly becomes the least important of the 10; is there a feasible explanation for this?"

On the search side, the Elasticsearch Learning to Rank plugin lets you develop query-dependent features and store them in Elasticsearch; after storing a set of features, you can log them for documents returned in search results, which helps when assembling training data.

Training was already supported on the GPU, so this post is primarily concerned with supporting the gradient computation for ranking on the GPU; the gradients were previously computed on the CPU for these objectives. XGBoost supports three LETOR ranking objective functions for gradient boosting: pairwise, ndcg, and map. Certain ranking algorithms like ndcg and map require the pairwise instances to be weighted after being chosen, to further minimize the pairwise loss, and the weighting occurs based on the rank of these instances when sorted by their corresponding predictions. The group information in the CSR format is represented, for example, as four groups in total with three items in group0, two items in group1, and so on. The segment indices are gathered next based on the positional indices from a holistic sort and are then sorted ascendingly to bring labels within a group together.

A frequent source of confusion is how dumped leaf values relate to predicted scores ("XGBoost learning-to-rank model to predictions core function?"). One user trained a one-tree model in Spark with -booster gbtree --eval_metric ndcg --objective rank:pairwise and dumped the model to text: the root splits on feature1 < 2323 (yes=1, no=2, missing=2), node 2 splits on feature2 < 2.00000095367431640625 (yes=5, no=6, missing=6), and the leaves hold 3:leaf=0.1649394333362579345703125, 4:leaf=0.049700520932674407958984375, 5:leaf=0.0433560945093631744384765625, and 6:leaf=-0.09195549786090850830078125. The test data has only one record, with feature1: 511 and feature2: missing, so it should be routed to leaf 4 (0.04970052…), yet the model's predicted score is 0.5497005. The reason is that a global bias of 0.5 is added to every leaf output, so the "transfer function" is effectively f(x) = x + 0.5. That bias is the base_score training parameter (see the parameter doc); its default is 0.5, and it can be set to 0 at training time if you want the raw leaf sums.
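A small sketch that illustrates the shift (toy data; because the pairwise objective only depends on score differences, the trees learned with different base_score values should coincide, so the two models' outputs differ by the base_score. Treat this as an illustration, not a guarantee across versions):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.random((8, 3))                 # one group of 8 documents, 3 features
y = rng.integers(0, 3, size=8)         # graded relevance labels

def train(base_score):
    dtrain = xgb.DMatrix(X, label=y)
    dtrain.set_group([8])              # a single query group
    params = {"objective": "rank:pairwise", "base_score": base_score, "seed": 0}
    return xgb.train(params, dtrain, num_boost_round=1)

dtest = xgb.DMatrix(X)
diff = train(0.5).predict(dtest) - train(0.0).predict(dtest)
print(diff)   # expected to be ~0.5 everywhere: the global bias is added on top of the leaf sum
```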
For the past several years, XGBoost has been widely used for tabular data inference, winning hundreds of challenges. It is a very fast, scalable implementation of gradient boosting, with XGBoost models regularly winning online data science competitions and being used at scale across different industries; it is basically designed to enhance the performance and speed of a machine learning model. The dmlc/xgboost repository describes it as a scalable, portable, and distributed gradient boosting (GBDT, GBRT, or GBM) library for Python, R, Java, Scala, C++ and more. Gradient boosting itself is a powerful machine learning algorithm used to achieve state-of-the-art accuracy on a variety of tasks such as regression, classification, and ranking, and it has achieved notice in machine learning competitions in recent years by "winning practically every competition in the structured data category."

Because a pairwise ranking approach is chosen during ranking, a pair of instances, one being itself, is chosen for every training instance within a group. The ndcg and map objective functions further optimize the pairwise loss by adjusting the weight of the chosen instance pair to improve the ranking quality: they do this by swapping the positions of the chosen pair, computing the NDCG or MAP ranking metric, and adjusting the weight of the instance by the computed metric. While the DCG criterion is non-convex and non-smooth, classification is very well studied. The ranking-related changes happen during the GetGradient step of training, as described in Figure 1. For this post, we discuss leveraging the large number of cores available on the GPU to massively parallelize these computations. The Thrust library that is used for sorting data on the GPU resorts to a much slower merge sort if items aren't naturally compared using weak-ordering semantics (simple less-than or greater-than operators), and if labels are similar, the compound predicates must also know how to extract and compare predictions for those labels.

LETOR is used in the information retrieval (IR) class of problems, as ranking related documents is paramount to returning optimal results. The initial ranking is based on the relevance judgement of an associated document with respect to a query, where the relevance label expresses how relevant the rating is in terms of popularity, profitability, and so on. Previously, we used Lucene for the fast retrieval of documents and then used a machine learning model for reordering them; Elasticsearch Learning to Rank supports min-max and standard feature normalization.

Learning to rank in XGBoost: ranking is supported natively, and you only need to set the objective in the model parameters, for example objective="rank:pairwise". However, the Text Input Format section of the official documentation only says that the input is a train.txt file plus a train.txt.group file; it does not spell out the concrete contents of these two files or how they are read, which can be confusing.
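To make that input format concrete, here is a hedged sketch of the usual ways to attach group information (the automatic pickup of the .group sidecar file and the qid= keyword reflect my understanding of current XGBoost behavior; check the docs for your version. The tiny arrays are made up):

```python
import numpy as np
import xgboost as xgb

# Option 1: LIBSVM-style text input. train.txt holds "<label> <idx>:<value> ..." lines,
# and train.txt.group holds one integer per query (group sizes, in row order).
# When both files sit side by side, XGBoost loads the .group file automatically.
# dtrain = xgb.DMatrix("train.txt")

# Option 2: in-memory arrays with explicit group sizes.
X = np.random.rand(6, 4)
y = np.array([2, 1, 0, 1, 0, 0])
dtrain = xgb.DMatrix(X, label=y)
dtrain.set_group([3, 3])               # two queries with three documents each

# Option 3 (newer releases): per-row query ids instead of group sizes.
# dtrain = xgb.DMatrix(X, label=y, qid=np.array([0, 0, 0, 1, 1, 1]))

bst = xgb.train({"objective": "rank:ndcg"}, dtrain, num_boost_round=10)
```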
Let's backtrack for a second. XGBoost stands for "Extreme Gradient Boosting," where the term "gradient boosting" originates from the paper Greedy Function Approximation: A Gradient Boosting Machine by Friedman. Gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler, weaker models; the whole idea is to correct the mistakes of the previous models, learn from them, and improve the performance at each subsequent step. XGBoost is an optimized distributed gradient boosting library that is highly efficient, flexible, and portable. It uses a gradient boosting framework and is best suited to structured or tabular data (rather than unstructured data such as images and text), and it has recently been dominating applied machine learning and Kaggle competitions for exactly that kind of data; the method has been widely applied by data scientists to get optimized results on various machine learning challenges. In this post you will discover XGBoost and get a gentle introduction to what it is, where it came from, and how you can learn more. Before running XGBoost, we must set three types of parameters: general parameters, booster parameters, and task parameters; general parameters relate to which booster we are using to do boosting, commonly a tree or a linear model.

XGBoost for learning to rank: our search engine has become quite powerful, and Learning To Rank (LETOR) is one such objective function. Readers bring up related use cases regularly: running tests between XGBoost and LightGBM for their ability to rank items, using xgboost to rank a set of products on product overview pages, trying to predict rankings over time (similar to a search engine query problem), and looking for a demo of learning to rank with xgboost in a Spark environment.

This post describes an approach taken to accelerate the ranking algorithms on the GPU. The number of training instances in these datasets typically runs in the order of several million, scattered across tens of thousands of groups. Gradients for multiple groups were previously computed concurrently on the CPU, based on the number of CPU cores available (or on the threading configuration). Sorting the instance metadata within each group on the GPU device incurs auxiliary device memory that is directly proportional to the size of the group. The instances have different properties, such as label and prediction, and they must be ranked according to different criteria; this is required to determine where an item originally present at position 'x' has been relocated to (that is, its new rank) had it been sorted by a different criterion. First, positional indices are created for all training instances; with these facilities in place, the ranking algorithms can be easily accelerated on the GPU. For each training instance, a counterpart from a different label group is chosen within its query group, and those two instances are then used to compute the gradient pair of the instance.
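To make "gradient pair" concrete, here is a sketch of the gradient and Hessian contributed by one chosen pair under a RankNet-style pairwise logistic loss, the formulation LambdaMART builds on. XGBoost's internal implementation adds metric-based weighting and differs in details, so treat the function name and scaling as illustrative:

```python
import numpy as np

def pairwise_grad(si, sj):
    """Gradient/Hessian for a pair where document i should rank above document j,
    under the pairwise logistic loss log(1 + exp(-(si - sj)))."""
    rho = 1.0 / (1.0 + np.exp(si - sj))   # sigmoid(-(si - sj))
    grad_i, grad_j = -rho, rho            # pushes si up and sj down
    hess = rho * (1.0 - rho)              # shared curvature term
    return (grad_i, hess), (grad_j, hess)

# Example: the preferred document is currently scored below its counterpart,
# so the gradients are large; they shrink as the ordering becomes correct.
print(pairwise_grad(0.1, 0.9))
```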
A bit of background first: XGBoost was developed by Tianqi Chen and is now part of a wider collection of open-source libraries developed by the Distributed Machine Learning Community (DMLC). Since its introduction in 2014, it has been the darling of machine learning hackathons and competitions, has been used in many winning solutions in data science competitions, and appears in real-world solutions at large enterprises like Capital One. It is one of the most popular machine learning libraries, and its Spark integration enables distributed training on a cluster of servers. The objective setting specifies the learning task (regression, classification, ranking, and so on) and the function to be used, and tuning the learning rate goes hand in hand with the number of trees: smaller learning rates generally require more trees to be added to the model.

On the search side, Amazon Elasticsearch Service now supports the open-source Learning to Rank plugin, which lets you use machine learning technologies to improve the ranking of the top results returned from a baseline relevance query; a typical search engine, for example, indexes several billion documents. In the product-ranking use case mentioned earlier, the features are product-related features like revenue, price, clicks, and impressions. For the speed benchmark against other libraries, note that since XGBoost and LightGBM implement different ranking objective functions, we used a regression objective to keep the comparison fair.

Labeled training data that is grouped on the criteria described earlier is ranked primarily based on a handful of common approaches; XGBoost uses the LambdaMART ranking algorithm (for boosted trees), which takes the pairwise-ranking approach to minimize pairwise loss by sampling many pairs. Thus, for group 0 in the preceding example, which contains the three training-instance labels [1, 1, 0], instances 0 and 1 (with label 1) choose instance 2 (as it is the only one outside their label group), while instance 2 (with label 0) can randomly choose either instance 0 or 1. On the CPU, gradient computation for different groups was parceled out across cores, which severely limited scaling, as training datasets containing large numbers of groups had to wait their turn until a CPU core became available. On the GPU, this entails sorting the labels in descending order for ranking, with similar labels further sorted by their prediction values in descending order; while they are being sorted, the positional indices are moved in tandem with the data. Next, scatter these positional indices to an indexable prediction array.
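A tiny NumPy illustration of that sort order (descending by label, ties broken by descending prediction); the real implementation uses segmented GPU sorts rather than NumPy, so this is only to show the ordering:

```python
import numpy as np

labels = np.array([1, 1, 0])
preds  = np.array([0.2, 0.7, 0.5])

# np.lexsort uses the last key as the primary key, so pass (secondary, primary);
# negating both keys turns the ascending sort into a descending one.
order = np.lexsort((-preds, -labels))
print(order)   # [1 0 2]: label-1 items first, higher prediction breaking the tie
```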
For more information about the mechanics of building such a benchmark dataset, see the LETOR references at the end of this post; a typical setup uses set1.train as the training split and set1.test as the test split. In the XGBoost paper, the authors describe a scalable end-to-end tree boosting system that is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. Training with XGBoost typically involves a few high-level steps (preparing the data and group information, choosing the objective and other parameters, and running the boosting rounds), and the learning task parameters decide on the learning scenario.

Two lighter examples of ranking-style data and models: in one Kaggle match dataset, killPlace is the ranking within a match of the number of enemy players killed, and killPoints is a kills-based external ranking of the player (think of this as an Elo ranking where only kills matter). In the Rock Paper Scissors example, an agent records a history of previous moves, opponent moves, and the correctness of its predictions; this dataset is passed into XGBoost to predict the opponent's move, and the process is applied iteratively, first predicting the opponent's next move purely from move history. One reader also asks whether there is a good alternative to XGBoost for learning to rank, given that XGBoost has produced good results for them; the xgboostExtension package is designed to make xgboost's ranking interfaces easier to use. See also: A Stochastic Learning-To-Rank Algorithm and its Application to Contextual Advertising; LETOR: A benchmark collection for research on learning to rank for information retrieval; Selection Criteria for LETOR benchmark datasets; Explaining and Accelerating Machine Learning for Loan Delinquencies; Gradient Boosting, Decision Trees and XGBoost with CUDA; Monitoring High-Performance Machine Learning Models with RAPIDS and whylogs; Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt.

Back on the GPU, the labels for all the training instances are sorted next; gathering all the labels based on the position indices sorts the labels within a group. This is done to see how the different group elements are scattered, so that you can bring labels belonging to the same group together later. The computed NDCG or MAP values are in turn used for weighing each instance's relative importance to the others within a group while computing the gradient pairs. Choose the appropriate objective function using the objective configuration parameter, for example NDCG (normalized discounted cumulative gain). The performance is largely going to be influenced by the number of instances within each group and the number of such groups. The gradient computation performance and the overall impact on training performance were compared after the change for the three ranking algorithms, using the benchmark datasets mentioned in the reference section, and the MAP ranking metric at the end of training was compared between the CPU and GPU runs to make sure they are within the tolerance level (1e-02).
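To make that metric-based pair weighting concrete, here is a rough CPU-side sketch using one common NDCG formulation. The exact gain, truncation, and scaling inside XGBoost's ndcg and map objectives differ, so this only illustrates the swap-and-reweight mechanics:

```python
import numpy as np

def ndcg(rels_in_ranked_order):
    """NDCG of relevance labels listed in the order the model currently ranks them."""
    rels = np.asarray(rels_in_ranked_order, dtype=float)
    discounts = np.log2(np.arange(2, rels.size + 2))
    dcg = np.sum((2.0 ** rels - 1.0) / discounts)
    ideal = np.sum((2.0 ** np.sort(rels)[::-1] - 1.0) / discounts)
    return dcg / ideal if ideal > 0 else 0.0

def pair_weight(rels_in_ranked_order, i, j):
    """Weight a chosen (i, j) pair by how much NDCG changes if the two swap positions."""
    swapped = list(rels_in_ranked_order)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(ndcg(swapped) - ndcg(rels_in_ranked_order))

# Labels of one group's documents, in the order of the current predictions.
print(pair_weight([0, 2, 1], 0, 1))   # swapping a low- and a high-relevance doc matters a lot
```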
XGBoost (eXtreme Gradient Boosting) is a popular and efficient open-source implementation of the gradient boosted trees algorithm; for more information on the ranking algorithm used here, see the paper A Stochastic Learning-To-Rank Algorithm and its Application to Contextual Advertising. The compound-predicate sorting approach described earlier contrasts with the much faster radix sort used here. With the labels sorted and the positional indices in place, you are now ready to rank the instances within each group based on the positional indices from above.
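A CPU-side NumPy illustration of that per-group ranking step (the GPU implementation uses segmented sorts over all groups at once instead of a Python loop; the numbers are made up):

```python
import numpy as np

preds = np.array([0.3, 0.9, 0.4, 0.1, 0.8, 0.6, 0.2])
group_sizes = [4, 3]                        # documents per query, in row order

start = 0
for size in group_sizes:
    seg = slice(start, start + size)
    order = np.argsort(-preds[seg])         # best-scoring document first, within the group
    print(order + start)                    # row positions, ranked within this group
    start += size
```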
The training data consists of instances (representing user queries and their associated documents) labeled with relevance judgements for the query-document pairs, and models are generated by running gradient descent against an objective function; because the individual boosted trees are built from those gradients, tree hyper-parameters can directly control model complexity, such as the maximum depth. XGBoost is a well-known gradient boosted decision tree (GBDT) machine learning package, and the benchmark timings reported for it here are in seconds for the 100 rounds of training. Previously, the gradients for the instances within a group were computed sequentially on a CPU core. On the GPU, segment indices are created that clearly delineate every group in the dataset, so that per-group work can be dispatched across all the cores at once.
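A minimal sketch of how such segment indices can be derived from the CSR-style group sizes (NumPy here purely for illustration; the GPU code computes the same mapping with device primitives):

```python
import numpy as np

group_sizes = np.array([3, 2, 4])                            # items per group
group_ptr = np.concatenate(([0], np.cumsum(group_sizes)))    # [0, 3, 5, 9]

# Segment index (group id) for every training-instance position.
positions = np.arange(group_ptr[-1])
segment = np.searchsorted(group_ptr, positions, side="right") - 1
print(segment)   # [0 0 0 1 1 2 2 2 2]
```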
To recap: XGBoost can train models on a single machine more than ten times faster than existing gradient boosting packages, and the learning rate shrinks each new tree's contribution (a form of weighting), which makes the fitting process more conservative; as noted above, smaller learning rates generally require more boosting rounds. For ranking, the benchmark datasets are grouped by queries, domains, and so on, and the practical difference between the pairwise and listwise setups shows up in the data: listwise training needs an extra query ID to distinguish each query, and the group information must be set so that the data are partitioned accordingly. In other words, you need to know how to specify both the training data and the group data, as covered earlier.

References:
T. Chen, H. Li, Q. Yang, and Y. Yu. General functional matrix factorization using gradient boosting. In Proceedings of the 30th International Conference on Machine Learning (ICML'13), volume 1, pages 436-444, 2013.
O. Chapelle and Y. Chang. Yahoo! Learning to Rank Challenge Overview. Journal of Machine Learning Research - W & CP, 14:1-24, 2011.
LETOR: A benchmark collection for research on learning to rank for information retrieval.