How to Optimize Linear Solvers
To optimize the performance of your solve function in Python, there are several strategies you can employ. I'll break the code into sections and provide recommendations for each, focusing on reducing computational overhead, minimizing memory usage, and leveraging more efficient data structures and libraries.
1. Import Statements
Make sure the imported modules are necessary, and remove any unused imports. In the snippet below, from pandas import json_normalize is redundant because the code calls pd.json_normalize, so it has been dropped.
2. Extracting Data and Initial Setup
import time
import pandas as pd
import numpy as np
from gurobipy import Env, Model, GRB

def solve(data):
    t1 = time.time()
    opt_id = data['OptimizationId']
    blb_id = data['BlobName']
    num_steps = data.get('NumberOfSteps', 0)
    is_debug = data.get('IsDebug', 0)
    loc_d = pd.json_normalize(data, 'Locations', errors='ignore', record_prefix='_')
    glo_d = pd.json_normalize(data, 'GlobalIngredients', errors='ignore', record_prefix='_')
    solve_type = 1 if num_steps > 0 else 0
    # Rest of the code...

Use data.get('key', default) instead of checking keys with if 'key' in data, and note that the unnecessary re-assignment of t1 has been removed.
3. Preprocessing Data
Instead of using nested loops and np.append, which are inefficient, use list comprehensions and the pd.DataFrame constructor directly.
dummy_nut = {
    "NutrientId": "NO_NUT",
    "Level": 0.0
}
cols_2_check = ['IngredientId', 'Min', 'Max', 'Cost', 'MinStep', 'MaxStep',
                'CostStep', 'Available', 'Global']

loc_ing_nut_data = []
for i in range(len(loc_d)):
    loc_id = loc_d['_LocationId'].loc[i]
    ingredients = pd.DataFrame(loc_d['_Ingredients'].loc[i])
    if len(ingredients.columns) < len(cols_2_check):
        ingredients = add_fake_col(cols_2_check, ingredients)
    for j in range(len(ingredients)):
        ing_row = ingredients.iloc[j]
        nutrients = pd.DataFrame(ing_row['NutrientLevels'])
        if nutrients.empty:
            nutrients = pd.DataFrame([dummy_nut])
        for k in range(len(nutrients)):
            nutrient_row = nutrients.iloc[k]
            loc_ing_nut_data.append([
                loc_id,
                ing_row['IngredientId'], ing_row['Min'], ing_row['Max'], ing_row['Cost'],
                ing_row['MinStep'], ing_row['MaxStep'], ing_row['CostStep'],
                ing_row['Available'], ing_row['Global'],
                nutrient_row['NutrientId'], nutrient_row['Level']
            ])

df_loc_ing_nut = pd.DataFrame(loc_ing_nut_data, columns=[
    'LocationId', 'IngredientId', 'Min', 'Max', 'Cost', 'MinStep', 'MaxStep',
    'CostStep', 'Available', 'Global', 'NutrientId', 'Level'])
df_loc_ing_nut['Index_LIN'] = range(len(df_loc_ing_nut))

if is_debug:
    print('Rip LIN:', round(time.time() - t1, 3), 's')
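The same table can also be built without per-row pandas access at all, by iterating the raw dicts in the payload. A minimal sketch, assuming the raw JSON keys match the normalized column names ('LocationId', 'Ingredients', 'NutrientLevels') and that the add_fake_col defaults can be expressed with dict.get:

# Flatten Locations -> Ingredients -> NutrientLevels in one pass; the iloc calls
# inside tight loops are what dominates the runtime of the version above.
loc_ing_nut_data = [
    [loc['LocationId'],
     ing.get('IngredientId'), ing.get('Min'), ing.get('Max'), ing.get('Cost'),
     ing.get('MinStep'), ing.get('MaxStep'), ing.get('CostStep'),
     ing.get('Available'), ing.get('Global'),
     nut['NutrientId'], nut['Level']]
    for loc in data['Locations']
    for ing in loc.get('Ingredients', [])
    for nut in (ing.get('NutrientLevels') or [dummy_nut])  # dummy when empty
]
df_loc_ing_nut = pd.DataFrame(loc_ing_nut_data, columns=[
    'LocationId', 'IngredientId', 'Min', 'Max', 'Cost', 'MinStep', 'MaxStep',
    'CostStep', 'Available', 'Global', 'NutrientId', 'Level'])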
4. Use pd.concat for Building DataFrames
For loc_spe_ing, loc_spe_nut, and other similar sections, use list comprehensions and pd.concat.
# Example for loc_spe_ing
loc_spe_ing_data = []
for i in range(len(loc_d)):
    loc_id = loc_d['_LocationId'].loc[i]
    specifications = pd.DataFrame(loc_d['_Specifications'].loc[i])
    for j in range(len(specifications)):
        spec_row = specifications.iloc[j]
        ing_reqs = pd.DataFrame(spec_row['IngredientRequirements'])
        if len(ing_reqs.columns) < len(cols_2_check):
            ing_reqs = add_fake_col(cols_2_check, ing_reqs)
        for k in range(len(ing_reqs)):
            req_row = ing_reqs.iloc[k]
            loc_spe_ing_data.append([
                loc_id, spec_row['SpecificationId'], spec_row['Tons'],
                req_row['IngredientId'], req_row['Min'], req_row['Max'],
                req_row['MinStep'], req_row['MaxStep'], req_row['FixedLevel']
            ])

df_loc_spe_ing = pd.DataFrame(loc_spe_ing_data, columns=[
    'LocationId', 'SpecificationId', 'Tons', 'IngredientId', 'Min', 'Max',
    'MinStep', 'MaxStep', 'FixedLevel'])
df_loc_spe_ing['Index_LSI'] = range(len(df_loc_spe_ing))

Apply similar changes to loc_spe_nut and the other sections where you build DataFrames.
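Since the heading names pd.concat, here is a sketch of that pattern as well: build one small frame per location and combine them in a single call. The df_loc_spe name and the explode step are illustrative, assuming the per-location frames share columns:

frames = []
for i in range(len(loc_d)):
    specs = pd.DataFrame(loc_d['_Specifications'].loc[i])
    specs['LocationId'] = loc_d['_LocationId'].loc[i]  # tag rows with their location
    frames.append(specs)
# A single concat at the end instead of growing a DataFrame row by row.
df_loc_spe = pd.concat(frames, ignore_index=True)
# The nested IngredientRequirements lists can then be expanded with explode.
df_loc_spe = df_loc_spe.explode('IngredientRequirements', ignore_index=True)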
5. Optimize Constraint Building
Instead of repeatedly accessing DataFrame elements within loops, use vectorized operations where possible.
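For example, a coefficient that would otherwise be computed cell by cell can be computed on whole columns at once. A minimal sketch; the CostPerLevel column is purely illustrative, not part of the original model:

# Whole-column arithmetic runs in C inside pandas/NumPy instead of a Python loop.
safe_level = df_loc_ing_nut['Level'].replace(0, np.nan)  # avoid division by zero
df_loc_ing_nut['CostPerLevel'] = (df_loc_ing_nut['Cost'] / safe_level).fillna(0.0)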
6. Gurobi Model Optimization
Make sure you're setting up the Gurobi model and variables efficiently. Avoid redundant operations and utilize Gurobi's batch methods where possible.
env = Env(params=dict(OutputFlag=0))
CNC = Model("CNC_OPT", env)
x_lsn = CNC.addVars(n_lsn, vtype=GRB.CONTINUOUS, name='x_lsn')
x_lsn_pos = CNC.addVars(n_lsn, lb=0, vtype=GRB.CONTINUOUS, name='x_lsn_pos')
x_lsn_neg = CNC.addVars(n_lsn, lb=0, vtype=GRB.CONTINUOUS, name='x_lsn_neg')

Similarly for the other variables...
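If your gurobipy version supports the matrix API (9.0 and later), Model.addMVar creates a whole vector of variables in one call and pairs naturally with NumPy coefficient arrays. A sketch mirroring the variables above; note that downstream code would then index MVar objects instead of tupledicts:

# Matrix-API equivalents: one call per vector of variables.
x_lsn = CNC.addMVar(n_lsn, name='x_lsn')            # continuous, lb=0 by default
x_lsn_pos = CNC.addMVar(n_lsn, lb=0.0, name='x_lsn_pos')
x_lsn_neg = CNC.addMVar(n_lsn, lb=0.0, name='x_lsn_neg')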
7. Constraint Building
Use list comprehensions and batch addition of constraints if possible.
CNC.addConstrs((sum_lsn[q_lsn_top] - sum_lsn[q_lsn_bot] * df_lsn['Min'].loc[lsn] + x_lsn_neg[lsn] >= 0
                for lsn in range(n_lsn) if df_lsn['IsRatio'].loc[lsn] == 1), name='rat_min_lsn')
CNC.addConstrs((sum_lsn[q_lsn_top] - sum_lsn[q_lsn_bot] * df_lsn['Max'].loc[lsn] - x_lsn_pos[lsn] <= 0
                for lsn in range(n_lsn) if df_lsn['IsRatio'].loc[lsn] == 1), name='rat_max_lsn')
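These generators call df_lsn[...].loc[lsn] once per row per constraint. Hoisting the columns into NumPy arrays first makes each iteration a cheap array read; a sketch under the same assumptions about df_lsn, sum_lsn, and the index variables:

is_ratio = df_lsn['IsRatio'].to_numpy() == 1
mins = df_lsn['Min'].to_numpy()
maxs = df_lsn['Max'].to_numpy()
ratio_rows = np.flatnonzero(is_ratio)  # row indices of the ratio constraints only

CNC.addConstrs((sum_lsn[q_lsn_top] - sum_lsn[q_lsn_bot] * mins[lsn] + x_lsn_neg[lsn] >= 0
                for lsn in ratio_rows), name='rat_min_lsn')
CNC.addConstrs((sum_lsn[q_lsn_top] - sum_lsn[q_lsn_bot] * maxs[lsn] - x_lsn_pos[lsn] <= 0
                for lsn in ratio_rows), name='rat_max_lsn')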
8. Clean Up Indexing and DataFrame Operations
Make sure to minimize the number of times you set indices or perform operations that can be batched together.
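One common pattern: if a column is repeatedly looked up by key, convert it to a plain dict once. The cost_by_ing name is illustrative:

# Build the lookup once; dict access is far cheaper than repeated .loc calls.
cost_by_ing = (df_loc_ing_nut.drop_duplicates('IngredientId')
               .set_index('IngredientId')['Cost']
               .to_dict())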
9. Optimize Objective Function Construction
Efficiently add the objective function and constraints.
obj_expr = (tot_cost + tot_cost_zero + tot_ls_bad * ls_bad +
tot_i_bad * i_bad + tot_li_bad * li_bad + tot_lsn_bad * lsn_bad)
CNC.setObjective(obj_expr, GRB.MINIMIZE)
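Component totals such as tot_cost are themselves large linear expressions; gurobipy's quicksum builds them faster than Python's built-in sum. A sketch, assuming costs is a NumPy array aligned with the x_lsn indices:

from gurobipy import quicksum

# quicksum avoids the overhead of building many intermediate LinExpr objects.
tot_cost = quicksum(costs[lsn] * x_lsn[lsn] for lsn in range(n_lsn))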
10. Profiling and Fine-tuning
Profile the code to identify bottlenecks using tools like cProfile or line_profiler.
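A minimal cProfile run, executed from a context where solve and a sample data payload are available:

import cProfile
import pstats

# Profile a single call to solve() and dump the stats to a file.
cProfile.runctx('solve(data)', globals(), locals(), 'solve.prof')
# Show the 20 functions with the highest cumulative time.
pstats.Stats('solve.prof').sort_stats('cumulative').print_stats(20)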
Summary
By applying these optimizations, you should see an improvement in the performance of your solve function. Each step focuses on reducing redundancy, leveraging efficient data structures, and using batch operations where possible. After making these changes, run performance tests to measure improvements and further refine the code as needed.