fastgplearn package¶
Subpackages¶
Submodules¶
fastgplearn.gp module¶
- fastgplearn.gp.choice(a, size=None, replace=True, p=None)¶
Generates a random sample from a given 1-D array
New in version 1.7.0.
Note
New code should use the choice method of a default_rng() instance instead; please see the random-quick-start.
- Parameters
a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if it were np.arange(a).
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
replace (boolean, optional) – Whether the sample is with or without replacement. Default is True, meaning that a value of a can be selected multiple times.
p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in a.
- Returns
samples – The generated random samples
- Return type
single item or ndarray
- Raises
ValueError – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size
See also
randint, shuffle, permutation
Generator.choice – which should be used in new code
Notes
Setting user-specified probabilities through p uses a more general but less efficient sampler than the default. The general sampler produces a different sample than the optimized sampler even if each element of p is 1 / len(a).
Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.
Examples
Generate a uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)
Generate a non-uniform random sample from np.arange(5) of size 3:
>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random
Generate a uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random
Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:
>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')
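The note for this function recommends the choice method of a default_rng() instance for new code. A minimal sketch of the equivalent calls follows, including row sampling through the axis keyword mentioned in the Notes. The seed and the array contents are arbitrary values chosen for illustration.

```python
import numpy as np

# Seeded Generator so the draws are reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)

# Uniform sample of size 3 from np.arange(5), with replacement.
uniform = rng.choice(5, size=3)

# Non-uniform sample without replacement; p must sum to 1.
weighted = rng.choice(5, size=3, replace=False, p=[0.1, 0.0, 0.3, 0.6, 0.0])

# Sample 2 distinct rows from a 2-D array via the axis keyword.
table = np.arange(12).reshape(4, 3)
rows = rng.choice(table, size=2, replace=False, axis=0)
```

Because only three entries of p are non-zero, the weighted draw without replacement always returns the values 0, 2 and 3 in some order.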
- fastgplearn.gp.crossover(pop_np, p_crossover=0.5)¶
Crossover.
- Parameters
pop_np (np.ndarray) – population
p_crossover (float) – probability for crossover.
- Returns
population with shape (n_pop,2**depth_max).
- Return type
pop (np.ndarray)
- fastgplearn.gp.csub_science(pop, sci_template)¶
Note: this modifies the initial population that is passed in.
Cython (pyx) version of the sci template substitution.
- fastgplearn.gp.generate_random(func_num, xs_num, pop_size=10, depth_min=1, depth_max=5, p=None, func_p=None, xs_p=None)¶
Generate the first population. Each individual is an ordered array such as [mark,1,2,3,4,5,6,7, 103,102,100,102,102,103,102,100], made of three parts:
1. First part: mark index of root.
2. Second part: index of x gene and f gene.
3. Third part: protect index of x gene.
- Parameters
func_num (int) – number of functions.
xs_num (int) – number of x features (n_fea).
pop_size (int) – population size.
depth_min (int) – min depth of expression.
depth_max (int) – max depth of expression.
p (None) – (just for test).
func_p (np.ndarray) – probabilities with shape (n_func,).
xs_p (np.ndarray) – probabilities with shape (n_fea,).
- Returns
population with shape (n_pop,2**depth_max).
- Return type
pop (np.ndarray)
- fastgplearn.gp.mutate(mutate_pop, func_num, xs_num, depth_min=1, depth_max=5, p_mutate=0.8, p=None, func_p=None, xs_p=None)¶
Mutate. Each individual is an ordered array such as [mark,1,2,3,4,5,6,7, 103,102,100,102,102,103,102,100], made of three parts:
1. First part: mark index of root.
2. Second part: index of x gene and f gene.
3. Third part: protect index of x gene.
- Parameters
func_num (int) – number of functions.
mutate_pop (np.ndarray) – population with shape (n_pop,2**depth_max).
xs_num (int) – number of x features (n_fea).
depth_min (int) – min depth of expression.
depth_max (int) – max depth of expression.
p (None) – (just for test).
func_p (np.ndarray) – probabilities with shape (n_func,).
xs_p (np.ndarray) – probabilities with shape (n_fea,).
p_mutate (float) – probability of mutation.
- Returns
population with shape (n_pop,2**depth_max).
- Return type
pop (np.ndarray)
- fastgplearn.gp.mutate_random(pop_np, func_num, xs_num, pop_size=10, depth_min=1, depth_max=5, p_mutate=0.8, p=None, func_p=None, xs_p=None)¶
Mutate. Each individual is an ordered array such as [mark,1,2,3,4,5,6,7, 103,102,100,102,102,103,102,100], made of three parts:
1. First part: mark index of root.
2. Second part: index of x gene and f gene.
3. Third part: protect index of x gene.
- Parameters
func_num (int) – number of functions.
pop_size (int) – population size.
pop_np (np.ndarray) – population with shape (n_pop,2**depth_max).
xs_num (int) – number of x features (n_fea).
depth_min (int) – min depth of expression.
depth_max (int) – max depth of expression.
p (None) – (just for test).
func_p (np.ndarray) – probabilities with shape (n_func,).
xs_p (np.ndarray) – probabilities with shape (n_fea,).
p_mutate (float) – probability of mutation.
- Returns
population with shape (n_pop,2**depth_max).
- Return type
pop (np.ndarray)
- fastgplearn.gp.mutate_sci(func_num, xs_num, pop_size=10, depth_min=1, depth_max=5, p=None, func_p=None, xs_p=None, sci_template=None)¶
Mutate. Each individual is an ordered array such as [mark,1,2,3,4,5,6,7, 103,102,100,102,102,103,102,100], made of three parts:
1. First part: mark index of root.
2. Second part: index of x gene and f gene.
3. Third part: protect index of x gene.
- Parameters
func_num (int) – number of functions.
pop_size (int) – population size.
xs_num (int) – number of x features (n_fea).
depth_min (int) – min depth of expression.
depth_max (int) – max depth of expression.
p (None) – (just for test).
func_p (np.ndarray) – probabilities with shape (n_func,).
xs_p (np.ndarray) – probabilities with shape (n_fea,).
sci_template (list of list) – the science expression templates.
- Returns
population with shape (n_pop,2**depth_max).
- Return type
pop (np.ndarray)
- fastgplearn.gp.select_index(score, num_percent=0.3, method='tournament', tour_num=3)¶
Selection.
- Parameters
score (np.ndarray) – scores with shape (n_res,).
num_percent (int,float) – number or percentage of the population to select.
method (str) – "tournament" or "k_best".
tour_num (int) – tournament size.
- Returns
index of selection to population.
- Return type
index (np.ndarray)
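As a rough illustration of the two selection methods named above, here is a minimal NumPy sketch of tournament and k-best selection. It is not fastgplearn's implementation. The helper name select_index_sketch, the assumption that higher scores are better, and the rule that a float num_percent is a fraction while an int is a count are all assumptions made for this example.

```python
import numpy as np


def select_index_sketch(score, num_percent=0.3, method="tournament",
                        tour_num=3, rng=None):
    """Return indices of selected individuals (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    score = np.asarray(score, dtype=float)
    n_res = score.shape[0]
    # Assumption: a float is a fraction of the population, an int is a count.
    if isinstance(num_percent, float):
        n_select = max(1, int(round(num_percent * n_res)))
    else:
        n_select = int(num_percent)

    if method == "k_best":
        # Indices of the n_select highest scores, best first.
        return np.argsort(score)[::-1][:n_select]
    if method == "tournament":
        # Each row is one tournament of tour_num randomly drawn contestants.
        contestants = rng.integers(0, n_res, size=(n_select, tour_num))
        winners = np.argmax(score[contestants], axis=1)
        return contestants[np.arange(n_select), winners]
    raise ValueError(f"unknown method: {method!r}")


scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2, 0.8])
best = select_index_sketch(scores, num_percent=2, method="k_best")
picked = select_index_sketch(scores, num_percent=0.5, method="tournament",
                             rng=np.random.default_rng(0))
```

Tournament selection can return the same index more than once, which keeps some selection pressure while still letting weaker individuals through; k-best is deterministic and always returns the top scorers.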
- fastgplearn.gp.set_seed(seed)¶
Set random seed.
- fastgplearn.gp.sub_re_hall99(inds, func_num, xs_num)¶
Substitute the 99 in halls.
- fastgplearn.gp.sub_science(pop, sci_template)¶
Note: this modifies the initial population that is passed in. Sci template substitution.
fastgplearn.sci_formula module¶
fastgplearn.skflow module¶
- class fastgplearn.skflow.SymbolicClassifier(population_size=10000, generations=20, stopping_criteria=0.95, store_of_fame=50, hall_of_fame=3, store=False, p_mutate=0.2, p_crossover=0.5, select_method='tournament', tournament_size=5, device='cpu', sci_template=None, constant_range=None, constants=None, depth=(2, 4), function_set=('add', 'sub', 'mul', 'div', 'pow2', 'pow3'), n_jobs=1, verbose=0, random_state=None, method_backend='p_numpy', func_p=None)¶
Bases: fastgplearn.skflow.SymbolicEstimator
A Genetic Programming symbolic classifier.
A symbolic classifier is an estimator that begins by building a population of naive random formulas to represent a relationship. The formulas are represented as tree-like structures with mathematical functions being recursively applied to variables and constants. Each successive generation of programs is then evolved from the one that came before it by selecting the fittest individuals from the population to undergo genetic operations such as crossover, mutation or reproduction.
The default score for finding expressions is accuracy.
Examples:
>>> from fastgplearn.skflow import SymbolicClassifier
>>> est_gp = SymbolicClassifier(population_size=5000,
...                             generations=20, stopping_criteria=0.01,
...                             p_crossover=0.7, p_mutate=0.1,
...                             verbose=1, random_state=0)
>>> est_gp.fit(X_train, y_train)
>>> est_gp.top_n()
>>> test_score = est_gp.score(X_test, y_test)
- Parameters
population_size (int) – population size, default 10000.
generations (int) – number of generations, default 20.
tournament_size (int) – tournament size for selection.
stopping_criteria (float) – stopping criterion on the correlation score, max 1.0.
constant_range (tuple) – tuple of floats, e.g. constant_range=(0,1.0).
constants (tuple) – tuple of floats, e.g. constants=(-1,1,2,10). If given, the parameter constant_range is ignored.
depth (tuple) – default (2, 4). The max depth must not exceed 8.
function_set (tuple) – tuple of str. Options: ("add", "sub", "mul", "div", "max", "min", "ln", "exp", "pow2", "pow3", "rec", "sin", "cos").
n_jobs (int) – number of parallel jobs.
verbose (bool) – print messages.
p_mutate (float) – mutation probability.
p_crossover (float) – crossover probability.
random_state (int) – random state.
hall_of_fame (int) – number of hall of fame individuals added to the next generation.
store_of_fame (int) – number of hall of fame individuals returned as the result.
method_backend (str) – options: ("p_numpy", "c_numpy", "p_torch", "c_torch").
device (str) – default "cpu"; "cuda:0" is only available with the torch backends.
func_p (np.ndarray,tuple) – with shape (n_function,), probability values of each function.
sci_template (str,list) – None, "default" or a user-defined list template, default None.
- best_expression(scoring='accuracy')¶
Print the best expression.
- static cla(pre_y)¶
classification tool.
- fit(X: numpy.ndarray, y: numpy.ndarray, xs_p: Optional[numpy.ndarray] = None, x_label=None)¶
Fitting.
- Parameters
X (np.ndarray) – with shape (n_sample,n_fea).
y (np.ndarray) – with shape (n_sample,).
xs_p (np.ndarray) – with shape (n_fea,), probability values of each xi.
x_label (np.ndarray) – with shape (n_fea,), names of xi.
- predict(X, y=None, n=0)¶
Return the real predicted y.
- Parameters
X (np.ndarray) – array-like of shape (n_samples, n_features). Input vectors, where n_samples is the number of samples and n_features is the number of features.
y (np.ndarray) – array-like of shape (n_samples,).
n (int) – index of the expression to use (the n-th expression).
- Returns
array-like of shape (n_samples,).
- Return type
y (np.ndarray)
- score(X, y, scoring='accuracy', n=0)¶
Return the mean accuracy on the given test data and labels.
- Parameters
X (np.ndarray) – array-like of shape (n_samples, n_features).
y (np.ndarray) – array-like of shape (n_samples,).
scoring (str) – see also sklearn.metrics.
n (int) – index of the expression to use (the n-th expression).
- Returns
Mean accuracy of self.predict(X) wrt. y.
- Return type
score (float)
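For reference, mean accuracy is the fraction of predictions that equal the true labels. A small NumPy sketch with made-up labels; the arrays are illustrative and y_pred stands in for the output of self.predict(X):

```python
import numpy as np

y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])  # stand-in for est.predict(X)

# Mean accuracy: share of samples whose predicted label matches the true one.
accuracy = float(np.mean(y_pred == y_true))
```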
- single_coef_logistic(X, y)¶
Fitting by sklearn.linear_model.LogisticRegression.
- top_n(n=0, scoring='accuracy')¶
Print the top n results. The best one has index 0.
- Parameters
scoring (str) – see also sklearn.metrics.
n (int) – index of the expression to use (the n-th expression).
- class fastgplearn.skflow.SymbolicEstimator(population_size=10000, generations=20, stopping_criteria=0.95, store_of_fame=50, hall_of_fame=3, store=False, p_mutate=0.2, p_crossover=0.5, select_method='tournament', tournament_size=5, device='cpu', sci_template=None, constant_range=None, constants=None, depth=(2, 5), function_set=('add', 'sub', 'mul', 'div', 'pow2', 'pow3'), n_jobs=1, verbose=0, random_state=None, method_backend='p_numpy', func_p=None)¶
Bases: sklearn.base.BaseEstimator, abc.ABC
- Parameters
population_size (int) – population size, default 10000.
generations (int) – number of generations, default 20.
tournament_size (int) – tournament size for selection.
stopping_criteria (float) – stopping criterion on the correlation score, max 1.0.
constant_range (tuple) – tuple of floats, e.g. constant_range=(0,1.0).
constants (tuple) – tuple of floats, e.g. constants=(-1,1,2,10). If given, the parameter constant_range is ignored.
depth (tuple) – default (2, 5). The max depth must not exceed 8.
function_set (tuple) – tuple of str. Options: ("add", "sub", "mul", "div", "max", "min", "ln", "exp", "pow2", "pow3", "rec", "sin", "cos").
n_jobs (int) – number of parallel jobs.
verbose (bool) – print messages.
p_mutate (float) – mutation probability.
p_crossover (float) – crossover probability.
random_state (int) – random state.
hall_of_fame (int) – number of hall of fame individuals added to the next generation.
store_of_fame (int) – number of hall of fame individuals returned as the result.
method_backend (str) – options: ("p_numpy", "c_numpy", "p_torch", "c_torch").
device (str) – default "cpu"; "cuda:0" is only available with the torch backends.
func_p (np.ndarray,tuple) – with shape (n_function,), probability values of each function.
sci_template (str,list) – None, "default" or a user-defined list template, default None.
- filter_sci_perset(sci_template)¶
Get the available sci templates.
- fit(X: numpy.ndarray, y: numpy.ndarray, xs_p: numpy.ndarray = None, x_label=None)¶
Fitting.
- Parameters
X (np.ndarray) – with shape (n_sample,n_fea).
y (np.ndarray) – with shape (n_sample,).
xs_p (np.ndarray) – with shape (n_fea,), probability values of each xi.
x_label (np.ndarray) – with shape (n_fea,), names of xi.
- abstract predict(X, y=None, n=0)¶
Return the real predicted y.
- refresh_xcs()¶
Refresh X and constant for each generation.
- refresh_xcs_more()¶
Refresh X and constant for each generation for torch.
- run_gp()¶
Run the GP processing.
- score(X, y, scoring, n=0)¶
Score.
- single_cal(n, new_x=None, with_coef=True)¶
Get the temporary predicted y of the n-th expression (without coef and intercept). This is not the final result!
- single_name(n)¶
Get the name of the n-th expression.
- class fastgplearn.skflow.SymbolicRegressor(population_size=10000, generations=20, stopping_criteria=0.95, store_of_fame=50, hall_of_fame=3, store=False, p_mutate=0.2, p_crossover=0.5, select_method='tournament', tournament_size=5, constant_range=None, constants=None, depth=(2, 4), function_set=('add', 'sub', 'mul', 'div', 'pow2', 'pow3'), sci_template=None, device='cpu', n_jobs=1, verbose=0, random_state=None, method_backend='p_numpy', func_p=None)¶
Bases: fastgplearn.skflow.SymbolicEstimator
A Genetic Programming symbolic regressor.
A symbolic regressor is an estimator that begins by building a population of naive random formulas to represent a relationship. The formulas are represented as tree-like structures with mathematical functions being recursively applied to variables and constants. Each successive generation of programs is then evolved from the one that came before it by selecting the fittest individuals from the population to undergo genetic operations such as crossover, mutation or reproduction.
The default score for finding expressions is R (the correlation coefficient), so the final score (r2 by default) has to be calculated in a further step.
Examples:
>>> from fastgplearn.skflow import SymbolicRegressor
>>> est_gp = SymbolicRegressor(population_size=5000,
...                            generations=20, stopping_criteria=0.01,
...                            p_crossover=0.7, p_mutate=0.1,
...                            verbose=1, random_state=0)
>>> est_gp.fit(X_train, y_train)
>>> est_gp.top_n()
>>> test_score = est_gp.score(X_test, y_test)
- Parameters
population_size (int) – population size, default 10000.
generations (int) – number of generations, default 20.
tournament_size (int) – tournament size for selection.
stopping_criteria (float) – stopping criterion on the correlation score, max 1.0.
constant_range (tuple) – tuple of floats, e.g. constant_range=(0,1.0).
constants (tuple) – tuple of floats, e.g. constants=(-1,1,2,10). If given, the parameter constant_range is ignored.
depth (tuple) – default (2, 4). The max depth must not exceed 8.
function_set (tuple) – tuple of str. Options: ("add", "sub", "mul", "div", "max", "min", "ln", "exp", "pow2", "pow3", "rec", "sin", "cos").
n_jobs (int) – number of parallel jobs.
verbose (bool) – print messages.
p_mutate (float) – mutation probability.
p_crossover (float) – crossover probability.
random_state (int) – random state.
hall_of_fame (int) – number of hall of fame individuals added to the next generation.
store_of_fame (int) – number of hall of fame individuals returned as the result.
method_backend (str) – options: ("p_numpy", "c_numpy", "p_torch", "c_torch").
device (str) – default "cpu"; "cuda:0" is only available with the torch backends.
func_p (np.ndarray) – with shape (n_function,), probability values of each function.
sci_template (str,list) – None, "default" or a user-defined list template, default None.
- best_expression(scoring='r2')¶
Print the best expression.
- predict(X, y=None, n=0)¶
Return the real predicted y.
- Parameters
X (np.ndarray) – array-like of shape (n_samples, n_features). Input vectors, where n_samples is the number of samples and n_features is the number of features.
y (np.ndarray) – array-like of shape (n_samples,).
n (int) – index of the expression to use (the n-th expression).
- Returns
array-like of shape (n_samples,).
- Return type
y (np.ndarray)
- score(X, y, scoring='r2', n=0)¶
Return the r2 score (default) on the given test data and labels.
- Parameters
X (np.ndarray) – array-like of shape (n_samples, n_features).
y (np.ndarray) – array-like of shape (n_samples,).
scoring (str) – see also sklearn.metrics.
n (int) – index of the expression to use (the n-th expression).
- Returns
Mean r2 of self.predict(X) wrt. y.
- Return type
score (float)
- static single_coef_linear(X, y)¶
Fitting by sklearn.linear_model.LinearRegression.
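The regressor searches with the correlation coefficient R and scores with r2 by default, with a linear fit of coef and intercept in between. The sketch below illustrates how those quantities relate, using plain NumPy with np.polyfit as a stand-in for sklearn.linear_model.LinearRegression. The data and the raw expression output are invented for the example, and this is not the library's code path.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x**2 + 1.5 + rng.normal(scale=0.05, size=x.size)

# Raw output of a candidate expression, before any coefficient or intercept.
raw = x**2

# Search score: Pearson correlation R between raw output and target.
r = np.corrcoef(raw, y)[0, 1]

# Fit coef and intercept so that y is approximated by coef * raw + intercept.
coef, intercept = np.polyfit(raw, y, deg=1)
y_pred = coef * raw + intercept

# r2 of the fitted prediction; for a one-variable least-squares fit with an
# intercept, this equals R squared.
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

Since R is unchanged by scaling and shifting the raw output, it can rank candidate expressions without fitting anything; the linear fit is only needed once a final prediction or r2 value is wanted.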
- top_n(n=0, scoring='r2')¶
Print the top n results. The best one has index 0.
- Parameters
scoring (str) – see also sklearn.metrics.
n (int) – index of the expression to use (the n-th expression).
- fastgplearn.skflow.randint(low, high=None, size=None, dtype=int)¶
Return random integers from low (inclusive) to high (exclusive).
Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).
Note
New code should use the integers method of a default_rng() instance instead; please see the random-quick-start.
- Parameters
low (int or array-like of ints) – Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
high (int or array-like of ints, optional) – If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values.
size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
dtype (dtype, optional) – Desired dtype of the result. Byteorder must be native. The default value is int.
New in version 1.11.0.
- Returns
out – size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.
- Return type
int or ndarray of ints
See also
random_integers – similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted.
Generator.integers – which should be used for new code.
Examples
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1], # random
       [3, 2, 2, 0]])
Generate a 1 x 3 array with 3 different upper bounds:
>>> np.random.randint(1, [3, 5, 10])
array([2, 2, 9]) # random
Generate a 1 by 3 array with 3 different lower bounds:
>>> np.random.randint([1, 5, 7], 10)
array([9, 8, 7]) # random
Generate a 2 by 4 array using broadcasting with dtype of uint8:
>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)
array([[ 8, 6, 9, 7], # random
       [ 1, 16, 9, 12]], dtype=uint8)
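The note for this function points new code to the integers method of a default_rng() instance. A minimal sketch of the corresponding calls follows; the seed and the bounds are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

# Ten integers from the half-open interval [0, 2).
bits = rng.integers(2, size=10)

# A 2 x 4 array of integers from [0, 5).
grid = rng.integers(5, size=(2, 4))

# Broadcasting: three different upper bounds with a shared lower bound of 1.
bounded = rng.integers(1, [3, 5, 10])

# endpoint=True switches to the closed interval [low, high].
dice = rng.integers(1, 6, size=5, endpoint=True)
```

Unlike randint, Generator.integers covers the closed-interval case through endpoint=True, so a separate random_integers style function is not needed.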
fastgplearn.tools module¶
- class fastgplearn.tools.Hall(size=10)¶
Bases: object
Hall of Fame.
Examples:
>>> hall = Hall(size=50)
>>> hall.update(inds, gen_i, score, consts)
>>> hall[i]
- best_constant()¶
Return the best individual’s constant for next generation.
- change0()¶
Change the unused constants to 0.
- sort_and_hash()¶
Remove repeated results. (This is not a perfect guarantee, because different individuals can encode the same expression.)
- top_n(n)¶
Return the top n results.
- update(inds, gen_i, score, consts)¶
Add individual.
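To make the idea of a size-limited hall of fame with duplicate removal concrete, here is a toy sketch. The class MiniHall is hypothetical and is not the Hall class documented here: it ignores constants, assumes higher scores are better, and removes duplicates by comparing the raw bytes of each individual.

```python
import numpy as np


class MiniHall:
    """Toy hall of fame: keeps the `size` best unique individuals."""

    def __init__(self, size=10):
        self.size = size
        self.entries = {}  # bytes of individual -> (score, gen_i, individual)

    def update(self, inds, gen_i, score):
        for ind, s in zip(inds, score):
            key = np.asarray(ind).tobytes()
            # Keep the better score when the same individual shows up again.
            if key not in self.entries or s > self.entries[key][0]:
                self.entries[key] = (float(s), gen_i, np.array(ind))
        # Trim to the best `size` entries.
        ranked = sorted(self.entries.items(),
                        key=lambda kv: kv[1][0], reverse=True)
        self.entries = dict(ranked[: self.size])

    def top_n(self, n):
        ranked = sorted(self.entries.values(),
                        key=lambda e: e[0], reverse=True)
        return ranked[:n]


hall = MiniHall(size=2)
inds = np.array([[1, 2, 3], [4, 5, 6], [1, 2, 3], [7, 8, 9]])
hall.update(inds, gen_i=0, score=np.array([0.5, 0.9, 0.6, 0.1]))
top = hall.top_n(2)
```

Byte-level comparison only catches identical arrays. Two different arrays that encode the same expression would both be kept, which is the same kind of imperfect guarantee noted for sort_and_hash.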
- class fastgplearn.tools.Logs(head_msg='')¶
Bases: object
Log the message.
Examples:
>>> log = Logs()
>>> log.record("score:0.9")
>>> log.print()
"score:0.9"
- print(head=False, row=True)¶
- prints(row=True)¶
- record(msg)¶
- record_and_print(msg, row=False)¶
- records(msg)¶
- fastgplearn.tools.find_add_mask_all_merge(pop, single_start=6)¶