A python based automated tool to validate excel sheets

This is a code sample to illustrate the capability of Python to automate working with excel sheets. Suppose, we are required to work on a set of excel files and to find some data. Normally, we will have to open each file and read the content to find the answers. This can be a tedious task, especially when the number of files are large.

Example scenarios include;

  1. Accounting tasks that are done daily/monthly, that are of repetitive nature
  2. Searching for a set of content in a group of excel files
  3. Validating answer sheets of an entire class and ranking them
  4. Applying a set of calculations to a number of similar files

We will try to implement one of these scenarios in the following code sample. This is an app to find total marks of every student in a class and then find the percentage of marks and then rank them. We are going to use openpyxl library to work on the excel files. This code covers the basic concepts like excel read, write, add rows and columns, apply style, calculations, etc.

Code is split into 2 modules, writeToExcel.py to create an excel sheet using a list of students with marks in various subjects and findRank.py to calculate the total marks, percentage and rank.
For convenience we are keeping data in a json file. We can use any database as well.

Step by step explanation below.

  1. Install Python and openpyxl

    Download and install the latest version of python from Python Website.

    Keep your favorite Python editor ready. This can be a basic notepad application available by default for your OS or you can download and install any free editor like VS Code.

    Open command prompt and navigate to the folder where you are going to place the code and activate virtual environment.

    Install openpyxl using command pip install openpyxl
    Refer openpyxl basics if needed.

    To check if libraries installed successfully, type command pip freeze
    Should give an output like:

    et-xmlfile==1.1.0
    openpyxl==3.0.9
    
  2. Data in Json File

    Create a file named data.json and add the below content. These are the list of students which we are going to write to the excel sheet.

    [
        {"rollno": "101", "name": "Lisa", "maths": 95, "science": 92,  "language": 97},
        {"rollno": "102", "name": "Sam", "maths": 89, "science": 91,  "language": 82},
        {"rollno": "103", "name": "Ted", "maths": 78, "science": 85,  "language": 88},
        {"rollno": "104", "name": "Rob", "maths": 82, "science": 87,  "language": 89},
        {"rollno": "105", "name": "Peter", "maths": 98, "science": 94,  "language": 100},
        {"rollno": "106", "name": "Sara", "maths": 94, "science": 93,  "language": 89},
        {"rollno": "107", "name": "Thomas", "maths": 89, "science": 90,  "language": 91},
        {"rollno": "108", "name": "Kate", "maths": 92, "science": 95,  "language": 88},
        {"rollno": "109", "name": "Tom", "maths": 80, "science": 84,  "language": 85},
        {"rollno": "110", "name": "Allen", "maths": 85, "science": 85,  "language": 90}
    ]
    
  3. Write Data to Excel

    writeToExcel.py create an excel work book with a sheet named Students. Then add column headings in first row and make it bold. Final step is to read json data and append it to the sheet as rows.

    Copied
    from openpyxl import Workbook
    from openpyxl.styles import Font
    import json
    import os
    
    w = Workbook()
    sh = w.active
    sh.title = 'Students'
    sh['A1'], sh['B1'], sh['C1'], sh['D1'], sh['E1'] = 'Roll No', 'Student Name', 'Maths', 'Science', 'Language'
    for hdng in sh["1:1"]:
        hdng.font = Font(bold=True)
    sh.column_dimensions[str(chr(64 + 2))].width = 20
    dataFile = os.path.dirname(__file__) + '\\data.json'
    with open(dataFile) as d:
        data = json.load(d)
        for dt in data:
            sh.append([dt['rollno'], dt['name'], dt['maths'], dt['science'], dt['language']])
    
    w.save('MarksList.xlsx')
    

    Use command python writeToExcel.py or py writeToExcel.py to run the module. If no errors, an excel sheet named MarksList.xlsx with below content will get generated.

    Excel sample
  4. Find Total Marks, Percentage and Rank

    findRank.py reads the excel data and adds three more columns Total, Percentage and Rank to keep the additional data.
    Then loop through each row and calculate total marks and percentage and update those in columns F and G. At the same time store the data in a list to apply the sorting using percentage of marks and find the rank. Then delete existing data and rewrite the new sorted data to excel again.

    Copied
    from openpyxl import Workbook, load_workbook
    from openpyxl.styles import Font
    import os
    
    def sortFunc(srt):
        return srt['per']
    
    w = load_workbook('MarksList.xlsx')
    sh = w.worksheets[0]
    sh['F1'], sh['G1'], sh['H1'] = 'Total', 'Percentage', 'Rank'
    for hdng in sh["1:1"]:
        hdng.font = Font(bold=True)
    sortList = []
    for i, r in enumerate(sh.iter_rows()):
        if(i > 0):
            sh['F' + str(i + 1)] = r[2].value + r[3].value + r[4].value
            sh['G' + str(i + 1)] = round(r[5].value / 3, 2)
            sortList.append({'rno': r[0].value, 'nm': r[1].value, 'mt': r[2].value,
                'sn': r[3].value, 'ln': r[4].value, 'tot': r[5].value, 'per': r[6].value})
    sortList.sort(reverse = True, key = sortFunc)
    
    sh.delete_rows(2, sh.max_row+1)
    
    for rank, itm in enumerate(sortList):
        sh.append([itm['rno'], itm['nm'], itm['mt'], itm['sn'],
            itm['ln'], itm['tot'], itm['per'], rank+1])
    
    w.save('MarksList.xlsx')
    

    On running, below excel sheet will be generated.

    Find rank
Absolute Code Works - Python Topics