Text processing commands in Linux

Filters/Text processing commands

cut
- allows us to cut the output, takes input of command/content of file and cut it to our desired output
awk
- list by columns
grep and egrep
- search by the keyword. Like searching for a specific word.
sort
- sorts out the output in alphabetical order
uniq
- not show duplicates in a file.
wc
- word count command waca command. tells us how many words, letters, line we've in a file/output.

Cut is a command line utility that allows us to cut parts of lines from specified files or piped data and print the result to stdout. It can be used to cut parts of a line by delimiter, byte position, and character.
cut filename = doesn't work you've to specify options in this command
cut --version = check utility version.
cut -c1 filename = list one character
cut -c1,2,4 filename = Pick and chose character
cut -c1-3 filename = list range of characters
cut -c1-3,6-8 filename = list specific range of characters.
cut -b1-3 filename = list by byte size
cut -d: -f 6 /etc/passwd = list first 6th column separated by :
ls -l | cut -c2-4 = only print user permissions of files/dir

awk is a utility/language designed for data extraction. Most of the time it is used to extract fields from a file or from an output.
check version : awk --version
print 1st column of character in a file: awk '{print $1}' file
second column/field: awk '{print $2}' file
output specific columns: ls -l | awk '{print $1,$3}'
last field of the output: ls -l | awk '{print $NF}'
search for specific keyword: awk '/dev/ {print}' tx
output only first fields of file: awk -F: '{print $1}' /etc/passwd
replace words fields words: echo "Hello Tushar" | awk '{$2="Rajpoot"; print $0}'
- $0 means whatever you find print it now.
replace words in output of file: cat tx | awk '{$1="Tushar"; print$0}'
get lines that have more than 3 bytes: awk 'length($0) > 3' tx
Get the field matching seinfeld in /home/tushar: ls -l | awk '{if($9 == "seinfeld") print $0;}'
Number of fields: ls -l | awk '{print NF}'

The grep command which stands for global regular expression print processes text line by line and prints any lines which match a specific pattern.
Check version or help: grep --version, grep --help
Search for a keyword from a file: grep keyword file
Search for a keyword and count: grep -c keyword file
Search for a keyword and ignore case-sensitive: grep -i KEYword file
Display the matched lines and their line numbers: grep -n keyword file
Display everything but keyword(exclude a keyword): grep -v keyword file
Search for a keyword and then only give the 1st field: grep keyword file | awk '{print $1}'
Search for a keyword and then only give the 1st field: ls -l | grep Desktop
Search for 2 keywords: egrep -i "keyword|keyword2" file

wc command either reads stdin or stdout or a list of files and generates: newline count, word count and byte count
check file line count, word count and byte count: wc file
Get the number of lines in a file: wc -l file
Number of words: wc -w file
Number of bytes: wc -c file
NOT ALLOWED: wc DIRECTORY
Number of files: ls -l | wc -l
Count directories only: ls -ld */ | wc -l
Number of keyword lines: grep keyword | wc -l