How to remove duplicate characters or words in a string

  • This sample shows how to remove duplicate characters or words from a string using Python's built-in string methods. The comparison is case-insensitive: the input is lowercased before duplicates are removed.

  • Remove the duplicate words from a string

    First, remove the characters that should be ignored (here, the dots) using the replace function. Then convert the string to a list of words using split; called without arguments, split breaks the string on any run of whitespace. Finally, loop through the words and append each one to the result string only if it is not already present.

    teststr = 'This is a test string. This test for the duplicate words in the string.'
    
    #Lowercase the text and remove dots
    teststr = teststr.lower().replace(".", "")
    #Convert string to list
    wordsinstr = teststr.split()
    
    #Remove duplicates, keeping the first occurrence of each word
    finalstring = ''
    for word in wordsinstr:
        if word not in finalstring.split():
            finalstring = finalstring + word + ' '
    
    
    print(finalstring.strip())
    
    Output:
      this is a test string for the duplicate words in
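
    The steps above can also be sketched as a reusable function. This is only an illustration, not part of the original sample: the name dedupe_words is hypothetical, it strips all ASCII punctuation with str.translate rather than only dots, and it tracks the words already seen in a set so the result string does not have to be split again on every iteration.

```python
import string


def dedupe_words(text):
    """Return text lowercased, with punctuation removed and repeated words dropped."""
    # Assumption: every ASCII punctuation character is ignored, not only the dot.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    seen = set()    # words already added to the result
    unique = []     # first occurrences, in their original order
    for word in cleaned.split():
        if word not in seen:
            seen.add(word)
            unique.append(word)
    return " ".join(unique)


print(dedupe_words("This is a test string. This test for the duplicate words in the string."))
# this is a test string for the duplicate words in
```

    For short strings the difference is negligible; the set lookup mainly helps when the input has many words.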

    Another easy way to do this is to convert the list of words to a set, as below. The output is a set of the unique words. The only drawback is that a set is unordered, so the original word order is not preserved and the printed order may vary between runs.

    teststr = 'This is a test string. This test for the duplicate words in the string.'
    
    #Lowercase the text and remove dots
    teststr = teststr.lower().replace(".", "")
    #Convert string to list
    wordsinstr = teststr.split()
    
    finalstring = set(wordsinstr)
    print(finalstring)
    
    Output:
      {'is', 'words', 'this', 'a', 'test', 'string', 'the', 'in', 'duplicate', 'for'}
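
    If word order matters, one possible alternative to a set is dict.fromkeys, which keeps only the first occurrence of each key and, from Python 3.7 on, preserves insertion order. The sketch below is an assumption-labelled variation on the sample, not the original author's code; ordered_unique is an illustrative name.

```python
teststr = 'This is a test string. This test for the duplicate words in the string.'
words = teststr.lower().replace(".", "").split()

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this drops repeats while preserving the first-seen order.
ordered_unique = list(dict.fromkeys(words))
print(" ".join(ordered_unique))
# this is a test string for the duplicate words in

# If any stable order will do, sorting the set gives a repeatable result.
print(sorted(set(words)))
# ['a', 'duplicate', 'for', 'in', 'is', 'string', 'test', 'the', 'this', 'words']
```
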
  • Remove duplicate characters from a string

    You can loop through the characters and keep only the first occurrence of each one. Alternatively, convert the string to a set, as above, to get the unique characters in no particular order.

    teststr = 'This is a test string'
    
    teststr = teststr.lower().replace(" ", "")
    
    #Keep only the first occurrence of each character
    finalstring = ''
    for c in teststr:
        if c not in finalstring:
            finalstring = finalstring + c
    
    print(finalstring)
    
    Output:
      thisaerng
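
    The sample above lowercases the whole string, so the original casing is lost. As a hedged variation, the sketch below still compares characters case-insensitively but keeps each first occurrence as it was written. The function name dedupe_chars is hypothetical, and the use of str.casefold as the comparison key is an assumption of this sketch.

```python
def dedupe_chars(text):
    """Drop repeated characters case-insensitively, keeping each first occurrence as written."""
    seen = set()    # case-folded characters already kept
    kept = []       # first occurrences, in their original order and casing
    for c in text:
        if c.isspace():
            continue            # whitespace is discarded, as in the sample above
        key = c.casefold()      # 'T' and 't' count as the same character
        if key not in seen:
            seen.add(key)
            kept.append(c)
    return "".join(kept)


print(dedupe_chars("This is a test string"))
# Thisaerng
```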