PHP - Word Density Counter Function

Before converting to WordPress, I wrote a word density counter function that reports how many times each word is used and what percentage of the document's total word count that represents.

The $min_word_char parameter drops any word shorter than that many characters. With the default of 2, all single-character words are removed from the count. (There was a good reason for this; not sure what it was.)
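As a rough illustration of the length filter on its own, here is a minimal sketch. The sample word list is invented for the example, and it uses an anonymous function, which assumes PHP 5.3 or later.

```php
<?php
// Hypothetical sample words, not taken from any real document.
$words = array('a', 'i', 'am', 'php', 'x', 'code');
$min_word_char = 2;

// Keep only the words that are at least $min_word_char characters long.
// array_values() re-indexes the result so the keys run 0, 1, 2, ...
$kept = array_values(array_filter($words, function ($word) use ($min_word_char) {
    return strlen($word) >= $min_word_char;
}));

print_r($kept);
```

With the default of 2, only the single-character entries are dropped here; raising $min_word_char to 3 would also drop "am".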

Excluded words are not counted or listed individually, but they still count toward the total word count that the percentages are based on. (This was done at the client's request.)
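To make the effect on the percentages concrete, here is a small worked sketch. The sample sentence and the variable names $percent_of_total and $percent_of_counted are made up for illustration; only the first calculation mirrors what the function does.

```php
<?php
// Hypothetical sample text and exclusion list.
$text = 'the cat and the dog';
$exclude_words = array('the');

// Every word counts toward the total, excluded or not.
$all_words = str_word_count($text, 1);
$total_words = count($all_words);

// The words that are actually reported: cat, and, dog.
$counted = array_values(array_diff($all_words, $exclude_words));

// "cat" appears once. Measured against the full total (the function's approach):
$percent_of_total = number_format((1 * 100) / $total_words, 2);

// For comparison only: measured against the non-excluded words.
$percent_of_counted = number_format((1 * 100) / count($counted), 2);

echo $percent_of_total, "%\n";
echo $percent_of_counted, "%\n";
```

Because the two occurrences of "the" stay in the denominator, "cat" comes out at 20.00% rather than 33.33%.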

      
        <?php

        function calculate_word_density( $string, $min_word_char = 2, $exclude_words = array() ) {
          // remove all html and php tags from the text
          $string = strip_tags($string);

          //convert all text to lowercase
          $string = mb_strtolower($string);

          // get an array containing all the words found inside $string
          $initial_words_array  =  str_word_count($string, 1);

          // count the size of the array to get the total words
          $total_words = sizeof($initial_words_array);

          // replace excluded words with blank
          $new_string = $string;
          foreach( $exclude_words as $filter_word ) {
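            // preg_quote() keeps regex metacharacters in an excluded word from breaking the pattern
            $new_string = preg_replace('/\b'.preg_quote($filter_word, '/').'\b/i', '', $new_string);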
          }

          // get an array without the excluded words
          $words_array = str_word_count($new_string, 1);

          // keep only the words that are >= the minimum word character length
          // (anonymous function instead of create_function(), which was removed in PHP 8)
          $words_array = array_filter($words_array, function( $var ) use ( $min_word_char ) {
            return strlen($var) >= $min_word_char;
          });

          // remove any duplicate words from the array
          $unique_words_array = array_unique($words_array);

          $density = array();
          foreach( $unique_words_array as $key => $word ) {
            preg_match_all('/\b'.$word.'\b/i', $string, $out);
            $count = count($out[0]);
            $percent = number_format((($count * 100) / $total_words), 2);
            $density[$key]['word'] = $word;
            $density[$key]['count'] = $count;
            $density[$key]['percent'] = $percent.'%';
          }
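          // sort by count, lowest first; an anonymous comparator avoids the
          // "cannot redeclare cmp()" fatal error when this function is called twice
          usort($density, function( $a, $b ) {
            return $a['count'] - $b['count'];
          });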
          return $density;
        }

        ?>
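The function returns a list of rows, each with 'word', 'count' and 'percent' keys, sorted by count from lowest to highest. Below is a minimal sketch of printing such a result with the most frequent words first. The $density rows are hard-coded, hypothetical values in that shape so the snippet runs on its own; in practice they would come from calculate_word_density().

```php
<?php
// Hypothetical rows in the shape calculate_word_density() returns.
$density = array(
    array('word' => 'dog',  'count' => 1, 'percent' => '10.00%'),
    array('word' => 'cat',  'count' => 2, 'percent' => '20.00%'),
    array('word' => 'fish', 'count' => 4, 'percent' => '40.00%'),
);

// The rows arrive sorted by count ascending, so reverse them
// to list the most frequent word first.
$lines = array();
foreach (array_reverse($density) as $row) {
    // Left-align the word in 10 columns, right-align the count in 3.
    $lines[] = sprintf('%-10s %3d  %s', $row['word'], $row['count'], $row['percent']);
}

echo implode("\n", $lines), "\n";
```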
      
    

PHP Function Links:

strip_tags(), mb_strtolower(), str_word_count(), sizeof(), preg_replace(), array_filter(), create_function(), strlen(), array_unique(), preg_match_all(), count(), number_format(), usort()