
How To Play with Word and Character Counts in Linux


Play with Word and Character Counts in Linux terminal

The wc (word count) command prints the newline, word, and byte counts of a file. This article explains how to work with word and character counts in the Linux terminal.
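As a quick illustration of the three counts, the sketch below writes a small throwaway file and asks wc for each count separately. The sample text and the variable names are made up for this example.

```shell
# Create a throwaway two-line sample file (hypothetical content).
sample=$(mktemp)
printf 'hello world\nfoo bar baz\n' > "$sample"

# Reading from stdin keeps the file name out of wc's output.
lines=$(wc -l < "$sample")   # newline count
words=$(wc -w < "$sample")   # word count
bytes=$(wc -c < "$sample")   # byte count

echo "lines=$lines words=$words bytes=$bytes"
rm -f "$sample"
```

For this sample, wc reports 2 lines, 5 words, and 24 bytes.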

To analyze a text file

Let's take the Samba configuration file smb.conf for testing purposes.

[root@linuxhelp ~]# cd /etc/samba/
[root@linuxhelp samba]# ls
lmhosts  smb.conf

To view the most frequently repeated words in the smb.conf file, along with their counts, run the following command.

[root@linuxhelp samba]# cat smb.conf | tr ' ' '\012' | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | grep -v '[^a-z]' | sort | uniq -c | sort -rn | head
    363 
     86 the
     66 to
     30 a
     22 samba
     21 on
     21 for
     20 yes
     20 is
     18 this
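The word-frequency pipeline can be wrapped in a small shell function so it works on any file. This is only a sketch: the function name word_freq and its optional count argument are assumptions made for illustration, and unlike the command above it discards empty lines instead of counting them.

```shell
# word_freq: print the most frequent words read from stdin.
# Usage: word_freq [N] < file   (N defaults to 10)
word_freq() {
    tr -s '[:space:]' '\n' |        # one word per line, squeezing runs of whitespace
    tr '[:upper:]' '[:lower:]' |    # fold case so "The" and "the" match
    tr -d '[:punct:]' |             # drop punctuation
    grep -E '^[a-z]+$' |            # keep purely alphabetic words, drop empty lines
    sort | uniq -c | sort -rn |     # count identical words, highest count first
    head -n "${1:-10}"
}

# Example on inline text; "the" appears three times.
printf 'The cat and the dog.\nThe end\n' | word_freq 3
```

To try it on the file used in this article, run word_freq 10 < smb.conf.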

The following command reads man.txt, a text file that holds the manual page content of the man command, and lists its 20 most frequent characters with case folded and punctuation stripped.

$ fold -w1 < man.txt | tr '[:lower:]' '[:upper:]' | sort | tr -d '[:punct:]' | uniq -c | sort -rn | head -20

The following command breaks a word into its individual characters.

[root@linuxhelp samba]# echo 'linuxhelp' | fold -w1
l
i
n
u
x
h
e
l
p

The -w option sets the line width; -w1 wraps the text at one column, so every character lands on its own line.
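To see how the width value changes the result, here is a minimal sketch that wraps the same word at three columns instead of one. The variable name pieces is made up for this example.

```shell
# Wrap the word at 3 columns instead of 1.
echo 'linuxhelp' | fold -w3

# Count how many lines that width produces.
pieces=$(echo 'linuxhelp' | fold -w3 | wc -l)
echo "pieces: $pieces"
```

The nine-character word is split into three lines: lin, uxh, and elp.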

To sort the result and get the output with frequency, use the following command.

[root@linuxhelp samba]# fold -w1 <  smb.conf | sort | uniq -c | sort -rn | head
   1636  
    887 e
    682 o
    663 t
    646 s
    615 a
    531 -
    523 i
    519 r
    496 n
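Raw counts are easier to compare when each one is shown as a share of the total. The sketch below adds an awk step for that; the function name char_share is hypothetical, and the code assumes plain ASCII text because fold -w1 splits by byte in common implementations.

```shell
# char_share: rank characters from stdin and show each one's share of the total.
# Output columns: character, count, percentage.
char_share() {
    fold -w1 | sort | uniq -c | sort -rn |
    awk '{ n[NR] = $1; c[NR] = $2; total += $1 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%s %d %.1f%%\n", c[i], n[i], 100 * n[i] / total
         }'
}

# Example: in "banana" the letter a makes up half of the characters.
printf 'banana\n' | char_share
```

Whitespace characters show up with an empty first column, because awk treats them as field separators.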

To count characters regardless of case, convert lowercase letters to uppercase before sorting and counting, using the following command.

[root@linuxhelp samba]# fold -w1 < smb.conf | tr '[:lower:]' '[:upper:]' | sort | uniq -c | sort -rn | head -20
   1636  
    903 E
    714 S
    702 O
    699 T
    620 A
    545 N
    539 I
    533 R
    531 -
    386 L
    285 M
    276 D
    260 H
    259 C
    238 U
    234 P
    224 =
    211 B
    210 #

To strip out punctuation, use the tr command with the -d option.

[root@linuxhelp samba]# fold -w1 < smb.conf | tr '[:lower:]' '[:upper:]' | sort | tr -d '[:punct:]' | uniq -c | sort -rn | head -20
   1636  
   1221 
    903 E
    714 S
    702 O
    699 T
    620 A
    545 N
    539 I
    533 R
    386 L
    285 M
    276 D
    261 
    260 H
    259 C
    238 U
    234 P
    211 B
    140 W
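The listing above still contains rows with no visible character. One way to get a letters-only ranking is to delete punctuation, whitespace, and digits before the text is folded, so no empty lines are ever produced. The following is a sketch under that assumption; the function name letter_freq is made up, and it targets ASCII text.

```shell
# letter_freq: case-insensitive letter frequencies read from stdin.
# Usage: letter_freq [N] < file   (N defaults to 20)
letter_freq() {
    tr -d '[:punct:][:space:][:digit:]' |   # delete non-letters before folding
    tr '[:lower:]' '[:upper:]' |            # fold case
    fold -w1 | sort | uniq -c | sort -rn |  # one letter per line, then count and rank
    head -n "${1:-20}"
}

# Example: L appears three times and O twice in "Hello, World!".
printf 'Hello, World! 123\n' | letter_freq 2
```

To apply it to the file from this article, run letter_freq < smb.conf.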

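To list repeated lines instead of single words, lowercase the text, strip punctuation and digits, count identical lines, and keep only the rows that are at least 18 characters wide (the grep pattern is 18 dots, and the width includes the count column added by uniq -c). Because the result is sorted with sort -n, the least frequent lines come first. Run the whole pipeline as a single line:

[root@linuxhelp samba]# cat smb.conf | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | tr -d '[0-9]' | sort | uniq -c | sort -n | grep -E '..................' | head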
      1     add group script  usrsbingroupadd g
      1     add machine script  usrsbinuseradd n c workstation u m d nohome s binfalse u
      1     add user script  usrsbinuseradd u n g users
      1  and groupadd family of binaries run the following command as the root user to
      1  a pershare basis
      1  apply the correct selinux labels to these files
      1  a publicly accessible directory that is read only except for users in the
      1  argument list can include mypdcname mybdcname and mynextbdcname
      1  boolean on
1    browser control options
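A long row of dots in a grep pattern is easy to miscount. As an alternative sketch, awk can filter lines by an explicit minimum length instead; the function name long_lines and its argument are hypothetical, and here the length is measured on the line itself, before any counting.

```shell
# long_lines: print the unique input lines that are at least MIN characters long.
# Usage: long_lines MIN < file
long_lines() {
    awk -v min="$1" 'length($0) >= min' | sort -u
}

# Example: only the 24-character line passes an 18-character minimum.
printf 'short\nthis line is long enough\nshort\n' | long_lines 18
```

The output can be piped into uniq -c or wc -l in the same way as the commands above.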

Tags:
connor
Author: 


Frequently asked questions ( 5 )

Q

How to find most frequently used words in Linux?

A

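Split the text into one word per line with tr, then rank the words with sort | uniq -c | sort -rn, as in the word-frequency command in this article. Pressing Ctrl+R in the terminal searches the shell's command history; it does not count words in a file.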

Q

How to check for the width?

A

Use the -w option of the fold command to set the line width, for example fold -w1.

Q

How to directly append files in Linux?

A

Use the ">>" redirection operator with the cat command; the input is appended to the end of the target file instead of overwriting it.
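A minimal sketch of the difference between ">" and ">>", using a temporary file; the file contents and variable names are made up for this example.

```shell
log=$(mktemp)

echo 'first line' > "$log"      # ">" truncates the file, then writes
echo 'second line' >> "$log"    # ">>" appends to the end

# cat with a here-document appends several lines at once.
cat <<'EOF' >> "$log"
third line
EOF

line_count=$(wc -l < "$log")
last_line=$(tail -n 1 "$log")
cat "$log"
rm -f "$log"
```

After these commands the file holds three lines, in the order they were written.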

Q

How to display and see the last 100 lines of a file?

A

Use the tail command with the -n option, for example "tail -n 100 /path/to/file".
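A short sketch of tail -n on a generated file of 250 numbered lines; the temporary file and variable names are assumptions for illustration.

```shell
f=$(mktemp)
seq 1 250 > "$f"                      # 250 lines: 1, 2, ..., 250

shown=$(tail -n 100 "$f" | wc -l)     # number of lines tail prints
first=$(tail -n 100 "$f" | head -n 1) # first of the last 100 lines

echo "lines shown: $shown, starting at line $first"
rm -f "$f"
```

With 250 lines in the file, the last 100 lines start at line 151.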

Q

What is the use of word count command in Linux?

A

The wc (word count) command prints the newline, word, and byte counts of a file.

© 2025 LinuxHelp.com All rights reserved. Linux™ is the registered trademark of Linus Torvalds. This site is not affiliated with Linus Torvalds in any way.