허프만 코딩 | 탐욕스러운 알고-3 - TECHCODEVIEW.COM

허프만 코딩은 무손실 데이터 압축 알고리즘입니다. 아이디어는 입력 문자에 가변 길이 코드를 할당하는 것이며, 할당된 코드의 길이는 해당 문자의 빈도를 기반으로 합니다.
입력 문자에 할당된 가변 길이 코드는 다음과 같습니다. 접두사 코드 는 한 문자에 할당된 코드가 다른 문자에 할당된 코드의 접두사가 아닌 방식으로 코드(비트 시퀀스)가 할당되었음을 의미합니다. 이것이 허프만 코딩이 생성된 비트스트림을 디코딩할 때 모호함이 없는지 확인하는 방법입니다.
반대 예를 들어 접두사 코드를 이해해 봅시다. 4개의 문자 a, b, c, d가 있고 해당 가변 길이 코드는 00, 01, 0, 1이라고 가정합니다. c에 할당된 코드는 a와 b에 할당된 코드의 접두어이기 때문에 이 코딩은 모호성을 초래합니다. 압축된 비트 스트림이 0001이면 압축 해제된 출력은 cccd, ccb, acd 또는 ab일 수 있습니다.
보다 이것 허프만 코딩의 응용을 위해.
허프만 코딩에는 크게 두 가지 부분이 있습니다.

입력 문자로부터 허프만 트리를 구축합니다.
허프만 트리를 탐색하고 캐릭터에 코드를 할당합니다.

연산:

최적의 접두사 코드를 구성하는 데 사용되는 방법은 다음과 같습니다. 허프만 코딩 .

이 알고리즘은 상향식 방식으로 트리를 구축합니다. 우리는 이 트리를 다음과 같이 나타낼 수 있습니다. 티

하자, |c| 잎의 개수가 되다

|c| -1은 노드를 병합하는 데 필요한 작업 수입니다. Q는 바이너리 힙을 구성하는 동안 사용할 수 있는 우선순위 큐입니다.

Algorithm Huffman (c) { n= |c| Q = c for i<-1 to n-1 do { temp <- get node () left (temp] Get_min (Q) right [temp] Get Min (Q) a = left [templ b = right [temp] F [temp]<- f[a] + [b] insert (Q, temp) } return Get_min (0) }>

허프만 트리 구축 단계
입력은 발생 빈도와 함께 고유한 문자의 배열이고 출력은 허프만 트리입니다.

각 고유 문자에 대해 리프 노드를 생성하고 모든 리프 노드의 최소 힙을 만듭니다. (최소 힙은 우선 순위 큐로 사용됩니다. 빈도 필드의 값은 최소 힙에 있는 두 노드를 비교하는 데 사용됩니다. 처음에는 가장 빈도가 낮은 문자는 다음과 같습니다. 뿌리)
최소 힙에서 최소 빈도를 갖는 두 개의 노드를 추출합니다.
두 노드 주파수의 합과 동일한 주파수를 가진 새 내부 노드를 만듭니다. 첫 번째 추출된 노드를 왼쪽 자식으로 만들고 다른 추출된 노드를 오른쪽 자식으로 만듭니다. 이 노드를 최소 힙에 추가합니다.
힙에 노드가 하나만 포함될 때까지 2단계와 3단계를 반복합니다. 나머지 노드는 루트 노드이고 트리가 완성됩니다.
예를 들어 알고리즘을 이해해 보겠습니다.

character Frequency a 5 b 9 c 12 d 13 e 16 f 45>

1 단계. 각 노드가 단일 노드가 있는 트리의 루트를 나타내는 6개의 노드를 포함하는 최소 힙을 구축합니다.
2 단계 최소 힙에서 두 개의 최소 주파수 노드를 추출합니다. 빈도가 5 + 9 = 14인 새 내부 노드를 추가합니다.

2단계 그림

이제 최소 힙에는 5개의 노드가 포함됩니다. 여기서 4개의 노드는 각각 단일 요소가 있는 트리의 루트이고, 하나의 힙 노드는 3개의 요소가 있는 트리의 루트입니다.

character Frequency c 12 d 13 Internal Node 14 e 16 f 45>

3단계: 힙에서 두 개의 최소 주파수 노드를 추출합니다. 빈도가 12 + 13 = 25인 새 내부 노드를 추가합니다.

3단계 그림

이제 최소 힙에는 4개의 노드가 포함됩니다. 여기서 2개의 노드는 각각 단일 요소가 있는 트리의 루트이고, 2개의 힙 노드는 둘 이상의 노드가 있는 트리의 루트입니다.

character Frequency Internal Node 14 e 16 Internal Node 25 f 45>

4단계: 두 개의 최소 주파수 노드를 추출합니다. 빈도가 14 + 16 = 30인 새 내부 노드를 추가합니다.

4단계 그림

이제 최소 힙에는 3개의 노드가 포함됩니다.

character Frequency Internal Node 25 Internal Node 30 f 45>

5단계: 두 개의 최소 주파수 노드를 추출합니다. 빈도가 25 + 30 = 55인 새 내부 노드를 추가합니다.

5단계 그림

이제 최소 힙에는 2개의 노드가 포함됩니다.

character Frequency f 45 Internal Node 55>

6단계: 두 개의 최소 주파수 노드를 추출합니다. 빈도가 45 + 55 = 100인 새 내부 노드를 추가합니다.

6단계 그림

이제 최소 힙에는 노드가 하나만 포함됩니다.

character Frequency Internal Node 100>

힙에는 노드가 하나만 포함되어 있으므로 알고리즘은 여기서 중지됩니다.

Huffman Tree에서 코드를 인쇄하는 단계:
루트부터 시작하여 형성된 트리를 탐색합니다. 보조 배열을 유지하십시오. 왼쪽 자식으로 이동하면서 배열에 0을 씁니다. 오른쪽 자식으로 이동하면서 배열에 1을 씁니다. 리프 노드를 만나면 배열을 인쇄합니다.

HuffmanTree에서 코드를 인쇄하는 단계

코드는 다음과 같습니다:

character code-word f 0 c 100 d 101 a 1100 b 1101 e 111>

권장 연습 허프만 인코딩 시도해 보세요!

다음은 위의 접근 방식을 구현한 것입니다.

씨

// C program for Huffman Coding> #include> #include> > // This constant can be avoided by explicitly> // calculating height of Huffman Tree> #define MAX_TREE_HT 100> > // A Huffman tree node> struct> MinHeapNode {> > >// One of the input characters> >char> data;> > >// Frequency of the character> >unsigned freq;> > >// Left and right child of this node> >struct> MinHeapNode *left, *right;> };> > // A Min Heap: Collection of> // min-heap (or Huffman tree) nodes> struct> MinHeap {> > >// Current size of min heap> >unsigned size;> > >// capacity of min heap> >unsigned capacity;> > >// Array of minheap node pointers> >struct> MinHeapNode** array;> };> > // A utility function allocate a new> // min heap node with given character> // and frequency of the character> struct> MinHeapNode* newNode(>char> data, unsigned freq)> {> >struct> MinHeapNode* temp = (>struct> MinHeapNode*)>malloc>(> >sizeof>(>struct> MinHeapNode));> > >temp->왼쪽 = 온도->오른쪽 = NULL;> >temp->데이터 = 데이터;> >temp->주파수 = 주파수;> > >return> temp;> }> > // A utility function to create> // a min heap of given capacity> struct> MinHeap* createMinHeap(unsigned capacity)> > {> > >struct> MinHeap* minHeap> >= (>struct> MinHeap*)>malloc>(>sizeof>(>struct> MinHeap));> > >// current size is 0> >minHeap->크기 = 0;> > >minHeap->용량 = 용량;> > >minHeap->배열 = (>struct> MinHeapNode**)>malloc>(> >minHeap->용량 *>sizeof>(>struct> MinHeapNode*));> >return> minHeap;> }> > // A utility function to> // swap two min heap nodes> void> swapMinHeapNode(>struct> MinHeapNode** a,> >struct> MinHeapNode** b)> > {> > >struct> MinHeapNode* t = *a;> >*a = *b;> >*b = t;> }> > // The standard minHeapify function.> void> minHeapify(>struct> MinHeap* minHeap,>int> idx)> > {> > >int> smallest = idx;> >int> left = 2 * idx + 1;> >int> right = 2 * idx + 2;> > >if> (left size> >&& minHeap->배열[왼쪽]->주파수> >array[smallest]->주파수)> >smallest = left;> > >if> (right size> >&& minHeap->배열[오른쪽]->주파수> >array[smallest]->주파수)> >smallest = right;> > >if> (smallest != idx) {> >swapMinHeapNode(&minHeap->배열[가장 작은],> >&minHeap->배열[idx]);> >minHeapify(minHeap, smallest);> >}> }> > // A utility function to check> // if size of heap is 1 or not> int> isSizeOne(>struct> MinHeap* minHeap)> {> > >return> (minHeap->크기 == 1);> }> > // A standard function to extract> // minimum value node from heap> struct> MinHeapNode* extractMin(>struct> MinHeap* minHeap)> > {> > >struct> MinHeapNode* temp = minHeap->배열[0];> >minHeap->배열[0] = minHeap->array[minHeap->size - 1];> > >--minHeap->크기;> >minHeapify(minHeap, 0);> > >return> temp;> }> > // A utility function to insert> // a new node to Min Heap> void> insertMinHeap(>struct> MinHeap* minHeap,> >struct> MinHeapNode* minHeapNode)> > {> > >++minHeap->크기;> >int> i = minHeap->크기 - 1;> > >while> (i> >&& minHeapNode->주파수> >array[(i - 1) / 2]->주파수) {> > >minHeap->배열[i] = minHeap->array[(i - 1) / 2];> >i = (i - 1) / 2;> >}> > >minHeap->배열[i] = minHeapNode;> }> > // A standard function to build min heap> void> buildMinHeap(>struct> MinHeap* minHeap)> > {> > >int> n = minHeap->크기 - 1;> >int> i;> > >for> (i = (n - 1) / 2; i>= 0; --i)> >minHeapify(minHeap, i);> }> > // A utility function to print an array of size n> void> printArr(>int> arr[],>int> n)> {> >int> i;> >for>

(i = 0; i   printf('%d', arr[i]);     printf('
');  }    // Utility function to check if this node is leaf  int isLeaf(struct MinHeapNode* root)    {     return !(root->왼쪽) && !(루트->오른쪽);  } // 크기와 동일한 // 용량의 최소 힙을 생성하고 // data[]의 모든 문자를 최소 힙에 삽입합니다. // 최소 힙의 초기 크기는 용량과 같습니다. struct MinHeap* createAndBuildMinHeap(char data[], int freq[], int size) { struct MinHeap* minHeap = createMinHeap(size);     for (int i = 0; i minHeap->array[i] = newNode(data[i], freq[i]); minHeap->size = size; buildMinHeap(minHeap); return minHeap; } // 주요 함수 허프만 트리를 생성합니다 struct MinHeapNode* buildHuffmanTree(char data[], int freq[], int size) { struct MinHeapNode *left, *right, *top; // 1단계: // 크기와 동일한 용량의 최소 힙을 만듭니다. . 처음에는 크기와 동일한 모드가 있습니다. struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size); // 힙 크기가 1이 되지 않는 동안 반복 while (!isSizeOne(minHeap)) { // 2단계: 최소 힙에서 // 두 개의 최소 빈도 항목 추출 left = extractMin(minHeap); right = extractMin(minHeap); // 3단계: // 빈도의 합과 동일한 새 내부 노드를 만듭니다. // 두 개의 추출된 노드를 이 새 노드의 // 왼쪽 및 오른쪽 자식으로 만듭니다. // 이 노드를 최소 힙에 추가합니다. // '$'는 //가 아닌 내부 노드에 대한 특수 값입니다. 사용 top = newNode('$', 왼쪽->주파수 + 오른쪽->주파수);     상단->왼쪽 = 왼쪽;   상단->오른쪽 = 오른쪽;     insertMinHeap(최소힙, 상단);   } // 4단계: 나머지 노드는 // 루트 노드이고 트리가 완성됩니다.   extractMin(minHeap)을 반환합니다.  } // 허프만 트리의 루트에서 허프만 코드를 인쇄합니다.  // arr[]를 사용하여 코드를 저장합니다. void printCodes(struct MinHeapNode* root, int arr[], int top) { // 왼쪽 가장자리에 0을 할당하고 반복합니다. if (root->left) { arr[top] = 0 ;   printCodes(root->left, arr, top + 1);   } // 오른쪽 가장자리에 1을 할당하고 반복됩니다. if (root->right) { arr[top] = 1;   printCodes(root->right, arr, top + 1);   } // 리프 노드인 경우 // 입력 문자 중 하나가 포함되어 // arr[]에서 문자와 해당 코드를 인쇄합니다. if (isLeaf(root)) { printf('%c: ', 루트->데이터);   printArr(arr, top);   } } // 허프만 트리를 구축하고 // 구축된 허프만 트리를 순회하여 // 코드를 인쇄하는 주요 함수 void HuffmanCodes(char data[], int freq[], int size) { // 허프만 트리 구축 struct MinHeapNode* root = buildHuffmanTree(데이터, 주파수, 크기);     // 위에 구축된 허프만 트리를 사용하여 허프만 코드를 인쇄합니다. int arr[MAX_TREE_HT], top = 0;     printCodes(루트, arr, 상단);  } // 드라이버 코드 int main() { char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f ' };   int freq[] = { 5, 9, 12, 13, 16, 45 };     int 크기 = sizeof(arr) / sizeof(arr[0]);     HuffmanCodes(arr, freq, size);     0을 반환합니다.  }>

C++

// C++ program for Huffman Coding> #include> #include> using> namespace> std;> > // This constant can be avoided by explicitly> // calculating height of Huffman Tree> #define MAX_TREE_HT 100> > // A Huffman tree node> struct> MinHeapNode {> > >// One of the input characters> >char> data;> > >// Frequency of the character> >unsigned freq;> > >// Left and right child of this node> >struct> MinHeapNode *left, *right;> };> > // A Min Heap: Collection of> // min-heap (or Huffman tree) nodes> struct> MinHeap {> > >// Current size of min heap> >unsigned size;> > >// capacity of min heap> >unsigned capacity;> > >// Array of minheap node pointers> >struct> MinHeapNode** array;> };> > // A utility function allocate a new> // min heap node with given character> // and frequency of the character> struct> MinHeapNode* newNode(>char> data, unsigned freq)> {> >struct> MinHeapNode* temp = (>struct> MinHeapNode*)>malloc>(> >sizeof>(>struct> MinHeapNode));> > >temp->왼쪽 = 온도->오른쪽 = NULL;> >temp->데이터 = 데이터;> >temp->주파수 = 주파수;> > >return> temp;> }> > // A utility function to create> // a min heap of given capacity> struct> MinHeap* createMinHeap(unsigned capacity)> > {> > >struct> MinHeap* minHeap> >= (>struct> MinHeap*)>malloc>(>sizeof>(>struct> MinHeap));> > >// current size is 0> >minHeap->크기 = 0;> > >minHeap->용량 = 용량;> > >minHeap->배열 = (>struct> MinHeapNode**)>malloc>(> >minHeap->용량 *>sizeof>(>struct> MinHeapNode*));> >return> minHeap;> }> > // A utility function to> // swap two min heap nodes> void> swapMinHeapNode(>struct> MinHeapNode** a,> >struct> MinHeapNode** b)> > {> > >struct> MinHeapNode* t = *a;> >*a = *b;> >*b = t;> }> > // The standard minHeapify function.> void> minHeapify(>struct> MinHeap* minHeap,>int> idx)> > {> > >int> smallest = idx;> >int> left = 2 * idx + 1;> >int> right = 2 * idx + 2;> > >if> (left size> >&& minHeap->배열[왼쪽]->주파수> >array[smallest]->주파수)> >smallest = left;> > >if> (right size> >&& minHeap->배열[오른쪽]->주파수> >array[smallest]->주파수)> >smallest = right;> > >if> (smallest != idx) {> >swapMinHeapNode(&minHeap->배열[가장 작은],> >&minHeap->배열[idx]);> >minHeapify(minHeap, smallest);> >}> }> > // A utility function to check> // if size of heap is 1 or not> int> isSizeOne(>struct> MinHeap* minHeap)> {> > >return> (minHeap->크기 == 1);> }> > // A standard function to extract> // minimum value node from heap> struct> MinHeapNode* extractMin(>struct> MinHeap* minHeap)> > {> > >struct> MinHeapNode* temp = minHeap->배열[0];> >minHeap->배열[0] = minHeap->array[minHeap->size - 1];> > >--minHeap->크기;> >minHeapify(minHeap, 0);> > >return> temp;> }> > // A utility function to insert> // a new node to Min Heap> void> insertMinHeap(>struct> MinHeap* minHeap,> >struct> MinHeapNode* minHeapNode)> > {> > >++minHeap->크기;> >int> i = minHeap->크기 - 1;> > >while> (i> >&& minHeapNode->주파수> >array[(i - 1) / 2]->주파수) {> > >minHeap->배열[i] = minHeap->array[(i - 1) / 2];> >i = (i - 1) / 2;> >}> > >minHeap->배열[i] = minHeapNode;> }> > // A standard function to build min heap> void> buildMinHeap(>struct> MinHeap* minHeap)> > {> > >int> n = minHeap->크기 - 1;> >int> i;> > >for> (i = (n - 1) / 2; i>= 0; --i)> >minHeapify(minHeap, i);> }> > // A utility function to print an array of size n> void> printArr(>int> arr[],>int> n)> {> >int> i;> >for>

(i = 0; i   cout << arr[i];     cout << '
';  }    // Utility function to check if this node is leaf  int isLeaf(struct MinHeapNode* root)    {     return !(root->왼쪽) && !(루트->오른쪽);  } // 크기와 동일한 // 용량의 최소 힙을 생성하고 // data[]의 모든 문자를 최소 힙에 삽입합니다. // 최소 힙의 초기 크기는 용량과 같습니다. struct MinHeap* createAndBuildMinHeap(char data[], int freq[], int size) { struct MinHeap* minHeap = createMinHeap(size);     for (int i = 0; i minHeap->array[i] = newNode(data[i], freq[i]); minHeap->size = size; buildMinHeap(minHeap); return minHeap; } // 주요 함수 허프만 트리를 생성합니다 struct MinHeapNode* buildHuffmanTree(char data[], int freq[], int size) { struct MinHeapNode *left, *right, *top; // 1단계: // 크기와 동일한 용량의 최소 힙을 만듭니다. . 처음에는 크기와 동일한 모드가 있습니다. struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size); // 힙 크기가 1이 되지 않는 동안 반복 while (!isSizeOne(minHeap)) { // 2단계: 최소 힙에서 // 두 개의 최소 빈도 항목 추출 left = extractMin(minHeap); right = extractMin(minHeap); // 3단계: // 빈도의 합과 동일한 새 내부 노드를 만듭니다. // 두 개의 추출된 노드를 이 새 노드의 // 왼쪽 및 오른쪽 자식으로 만듭니다. // 이 노드를 최소 힙에 추가합니다. // '$'는 //가 아닌 내부 노드에 대한 특수 값입니다. 사용 top = newNode('$', 왼쪽->주파수 + 오른쪽->주파수);     상단->왼쪽 = 왼쪽;   상단->오른쪽 = 오른쪽;     insertMinHeap(최소힙, 상단);   } // 4단계: 나머지 노드는 // 루트 노드이고 트리가 완성됩니다.   extractMin(minHeap)을 반환합니다.  } // 허프만 트리의 루트에서 허프만 코드를 인쇄합니다.  // arr[]를 사용하여 코드를 저장합니다. void printCodes(struct MinHeapNode* root, int arr[], int top) { // 왼쪽 가장자리에 0을 할당하고 반복합니다. if (root->left) { arr[top] = 0 ;   printCodes(root->left, arr, top + 1);   } // 오른쪽 가장자리에 1을 할당하고 반복됩니다. if (root->right) { arr[top] = 1;   printCodes(root->right, arr, top + 1);   } // 리프 노드인 경우 // 입력 문자 중 하나가 포함되어 // arr[]에서 문자와 해당 코드를 인쇄합니다. if (isLeaf(root)) { cout ': ';   printArr(arr, top);   } } // 허프만 트리를 구축하고 // 구축된 허프만 트리를 순회하여 // 코드를 인쇄하는 주요 함수 void HuffmanCodes(char data[], int freq[], int size) { // 허프만 트리 구축 struct MinHeapNode* root = buildHuffmanTree(데이터, 주파수, 크기);     // 위에 구축된 허프만 트리를 사용하여 허프만 코드를 인쇄합니다. // int arr[MAX_TREE_HT], top = 0;     printCodes(루트, arr, 상단);  } // 드라이버 코드 int main() { char arr[] = { 'a', 'b', 'c', 'd', 'e', 'f ' };   int freq[] = { 5, 9, 12, 13, 16, 45 };     int 크기 = sizeof(arr) / sizeof(arr[0]);     HuffmanCodes(arr, freq, size);     0을 반환합니다.  }>

C++

// C++(STL) program for Huffman Coding with STL> #include> using> namespace> std;> > // A Huffman tree node> struct> MinHeapNode {> > >// One of the input characters> >char> data;> > >// Frequency of the character> >unsigned freq;> > >// Left and right child> >MinHeapNode *left, *right;> > >MinHeapNode(>char> data, unsigned freq)> > >{> > >left = right = NULL;> >this>->데이터 = 데이터;> >this>->주파수 = 주파수;> >}> };> > // For comparison of> // two heap nodes (needed in min heap)> struct> compare {> > >bool> operator()(MinHeapNode* l, MinHeapNode* r)> > >{> >return> (l->주파수> r->주파수);> >}> };> > // Prints huffman codes from> // the root of Huffman Tree.> void> printCodes(>struct> MinHeapNode* root, string str)> {> > >if> (!root)> >return>;> > >if> (root->데이터 !=>'$'>)> >

cout ': ' << str << '
';     printCodes(root->왼쪽, str + '0');   printCodes(root->right, str + '1');  } // 허프만 트리를 구축하고 // 구축된 허프만 트리를 순회하여 코드를 인쇄하는 주요 함수 void HuffmanCodes(char data[], int freq[], int size) { struct MinHeapNode *left, *right, *top;     // 최소 힙을 생성하고 data[]의 모든 문자를 삽입합니다. Priority_queue 비교> minHeap;     for (int i = 0; i minHeap.push(new MinHeapNode(data[i], freq[i]))); // 힙 크기가 1이 되지 않는 동안 반복 while (minHeap.size() != 1 ) { // 최소 힙에서 // 두 개의 최소 빈도 항목을 추출합니다. left = minHeap.top(); right = minHeap.top(); // 새 내부 노드를 만듭니다. // 빈도는 두 노드 빈도의 합과 동일합니다. // 두 개의 추출된 노드를 이 새 노드의 // 왼쪽 및 오른쪽 하위 항목으로 만듭니다. // 이 노드를 최소 힙 '$'에 추가합니다. // 내부 노드용 특수 값 top = new MinHeapNode('$', left->freq + right->freq) top->left = left->right = minHeap.push; (top); } // // 위에 구축된 Huffman 트리를 사용하여 Huffman 코드를 인쇄합니다. printCodes(minHeap.top(), '') } // 드라이버 코드 int main() { char arr[] = { ' a', 'b', 'c', 'd', 'e', 'f' }; int freq[] = { 5, 9, 12, 13, 16 , 45 }; int size = sizeof(arr) / sizeof(arr[0]);     0을 반환합니다.  } // 이 코드는 Aditya Goel이 제공한 것입니다.>

자바 이진 트리

자바

import> java.util.Comparator;> import> java.util.PriorityQueue;> import> java.util.Scanner;> > class> Huffman {> > >// recursive function to print the> >// huffman-code through the tree traversal.> >// Here s is the huffman - code generated.> >public> static> void> printCode(HuffmanNode root, String s)> >{> > >// base case; if the left and right are null> >// then its a leaf node and we print> >// the code s generated by traversing the tree.> >if> (root.left ==>null> && root.right ==>null> >&& Character.isLetter(root.c)) {> > >// c is the character in the node> >System.out.println(root.c +>':'> + s);> > >return>;> >}> > >// if we go to left then add '0' to the code.> >// if we go to the right add'1' to the code.> > >// recursive calls for left and> >// right sub-tree of the generated tree.> >printCode(root.left, s +>'0'>);> >printCode(root.right, s +>'1'>);> >}> > >// main function> >public> static> void> main(String[] args)> >{> > >Scanner s =>new> Scanner(System.in);> > >// number of characters.> >int> n =>6>;> >char>[] charArray = {>'a'>,>'b'>,>'c'>,>'d'>,>'e'>,>'f'> };> >int>[] charfreq = {>5>,>9>,>12>,>13>,>16>,>45> };> > >// creating a priority queue q.> >// makes a min-priority queue(min-heap).> >PriorityQueue q> >=>new> PriorityQueue(> >n,>new> MyComparator());> > >for> (>int> i =>0>

; i     // creating a Huffman node object   // and add it to the priority queue.   HuffmanNode hn = new HuffmanNode();     hn.c = charArray[i];   hn.data = charfreq[i];     hn.left = null;   hn.right = null;     // add functions adds   // the huffman node to the queue.   q.add(hn);   }     // create a root node   HuffmanNode root = null;     // Here we will extract the two minimum value   // from the heap each time until   // its size reduces to 1, extract until   // all the nodes are extracted.   while (q.size()>1) { // 첫 번째 분 추출.   HuffmanNode x = q.peek();   q.poll();     // 두 번째 최소 추출.   HuffmanNode y = q.peek();   q.poll();     // 동일한 새 노드 f HuffmanNode f = new HuffmanNode();     // 두 노드의 빈도의 합으로 // f 노드에 값을 할당합니다.   f.data = x.data + y.data;   f.c = '-';     // 첫 번째 노드를 왼쪽 자식으로 추출했습니다.   f.왼쪽 = x;     // 두 번째로 추출된 노드가 오른쪽 자식입니다.   f.오른쪽 = y;     // f 노드를 루트 노드로 표시합니다.   루트 = f;     // 이 노드를 우선순위 큐에 추가합니다.   q.추가(f);   } // 트리를 탐색하여 코드를 인쇄합니다. printCode(root, '');   } } // 노드 클래스는 허프만 트리에 존재하는 // 각 노드의 기본 구조입니다.  클래스 HuffmanNode { int 데이터;   문자 C;     허프만노드 왼쪽;   HuffmanNode 오른쪽;  } // 비교기 클래스는 속성 중 하나를 기준으로 // 노드를 비교하는 데 도움이 됩니다.  // 여기서는 노드의 데이터 값을 기준으로 // 비교하겠습니다.  MyComparator 클래스는 Comparator { public int Compare(HuffmanNode x, HuffmanNode y) { return x.data - y.data;를 구현합니다.   } } // 이 코드는 Kunwar Desh Deepak Singh이 제공한 것입니다.>

파이썬3

# A Huffman Tree Node> import> heapq> > > class> node:> >def> __init__(>self>, freq, symbol, left>=>None>, right>=>None>):> ># frequency of symbol> >self>.freq>=> freq> > ># symbol name (character)> >self>.symbol>=> symbol> > ># node left of current node> >self>.left>=> left> > ># node right of current node> >self>.right>=> right> > ># tree direction (0/1)> >self>.huff>=> ''> > >def> __lt__(>self>, nxt):> >return> self>

.freq      # utility function to print huffman  # codes for all symbols in the newly  # created Huffman tree  def printNodes(node, val=''):     # huffman code for current node   newVal = val + str(node.huff)     # if node is not an edge node   # then traverse inside it   if(node.left):   printNodes(node.left, newVal)   if(node.right):   printNodes(node.right, newVal)     # if node is edge node then   # display its huffman code   if(not node.left and not node.right):   print(f'{node.symbol} ->{newVal}') # 허프만 트리 문자 = ['a', 'b', 'c', 'd', 'e', 'f'] # 문자의 빈도 freq = [5, 9, 12, 13, 16, 45] # 사용되지 않은 노드가 포함된 목록 node = [] # 문자와 빈도를 # 허프만 트리로 변환 node for x in range(len(chars)): heapq .heappush(nodes, node(freq[x], chars[x])) while len(nodes)> 1: # 모든 노드를 빈도에 따라 # 오름차순으로 정렬 left = heapq.heappop(nodes) right = heapq .heappop(nodes) # 이 노드에 방향 값을 할당합니다. left.huff = 0 right.huff = 1 # 2개의 가장 작은 노드를 결합하여 # 새 노드를 상위 노드로 만듭니다. newNode = node(left.freq+right.freq, left. 기호+right.symbol, 왼쪽, 오른쪽) heapq.heappush(nodes, newNode) # 허프만 트리가 준비되었습니다!  printNodes(노드[0])>

자바스크립트

> > // node class is the basic structure> // of each node present in the Huffman - tree.> class HuffmanNode> {> >constructor()> >{> >this>.data = 0;> >this>.c =>''>;> >this>.left =>this>.right =>null>;> >}> }> > // recursive function to print the> >// huffman-code through the tree traversal.> >// Here s is the huffman - code generated.> >function> printCode(root,s)> >{> >// base case; if the left and right are null> >// then its a leaf node and we print> >// the code s generated by traversing the tree.> >if> (root.left ==>null> >&& root.right ==>null> >&& (root.c).toLowerCase() != (root.c).toUpperCase()) {> > >// c is the character in the node> >document.write(root.c +>':'> + s+>' '>);> > >return>;> >}> > >// if we go to left then add '0' to the code.> >// if we go to the right add'1' to the code.> > >// recursive calls for left and> >// right sub-tree of the generated tree.> >printCode(root.left, s +>'0'>);> >printCode(root.right, s +>'1'>);> >}> > >// main function> // number of characters.> >let n = 6;> >let charArray = [>'a'>,>'b'>,>'c'>,>'d'>,>'e'>,>'f'> ];> >let charfreq = [ 5, 9, 12, 13, 16, 45 ];> > >// creating a priority queue q.> >// makes a min-priority queue(min-heap).> >let q = [];> > >for>

(let i = 0; i     // creating a Huffman node object   // and add it to the priority queue.   let hn = new HuffmanNode();     hn.c = charArray[i];   hn.data = charfreq[i];     hn.left = null;   hn.right = null;     // add functions adds   // the huffman node to the queue.   q.push(hn);   }     // create a root node   let root = null;   q.sort(function(a,b){return a.data-b.data;});     // Here we will extract the two minimum value   // from the heap each time until   // its size reduces to 1, extract until   // all the nodes are extracted.   while (q.length>1) { // 첫 번째 분 추출.   x = q[0]이라고 하자;   q.shift();     // 두 번째 최소 추출.   y = q[0]이라고 하자;   q.shift();     // 동일한 새 노드 f let f = new HuffmanNode();     // 두 노드의 빈도의 합으로 // f 노드에 값을 할당합니다.   f.data = x.data + y.data;   f.c = '-';     // 첫 번째 노드를 왼쪽 자식으로 추출했습니다.   f.왼쪽 = x;     // 두 번째로 추출된 노드가 오른쪽 자식입니다.   f.오른쪽 = y;     // f 노드를 루트 노드로 표시합니다.   루트 = f;     // 이 노드를 우선순위 큐에 추가합니다.   q.push(f);   q.sort(function(a,b){return a.data-b.data;});   } // 트리를 탐색하여 코드를 인쇄합니다. printCode(root, '');    // 이 코드는 avanitrachhadiya2155가 제공한 것입니다.>

씨#

// C# program for the above approach> > using> System;> using> System.Collections.Generic;> > // A Huffman tree node> public> class> MinHeapNode> {> >// One of the input characters> >public> char> data;> > >// Frequency of the character> >public> uint> freq;> > >// Left and right child> >public> MinHeapNode left, right;> > >public> MinHeapNode(>char> data,>uint> freq)> >{> >left = right =>null>;> >this>.data = data;> >this>.freq = freq;> >}> }> > // For comparison of two heap nodes (needed in min heap)> public> class> CompareMinHeapNode : IComparer> {> >public> int> Compare(MinHeapNode x, MinHeapNode y)> >{> >return> x.freq.CompareTo(y.freq);> >}> }> > class> Program> {> >// Prints huffman codes from the root of Huffman Tree.> >static> void> printCodes(MinHeapNode root,>string> str)> >{> >if> (root ==>null>)> >return>;> > >if> (root.data !=>'$'>)> >Console.WriteLine(root.data +>': '> + str);> > >printCodes(root.left, str +>'0'>);> >printCodes(root.right, str +>'1'>);> >}> > >// The main function that builds a Huffman Tree and> >// print codes by traversing the built Huffman Tree> >static> void> HuffmanCodes(>char>[] data,>uint>[] freq,>int> size)> >{> >MinHeapNode left, right, top;> > >// Create a min heap & inserts all characters of data[]> >var> minHeap =>new> SortedSet(>new> CompareMinHeapNode());> > >for> (>int>

i = 0; i   minHeap.Add(new MinHeapNode(data[i], freq[i]));     // Iterate while size of heap doesn't become 1   while (minHeap.Count != 1)   {   // Extract the two minimum freq items from min heap   left = minHeap.Min;   minHeap.Remove(left);     right = minHeap.Min;   minHeap.Remove(right);     // Create a new internal node with frequency equal to the sum of the two nodes frequencies.   // Make the two extracted node as left and right children of this new node.   // Add this node to the min heap '$' is a special value for internal nodes, not used.   top = new MinHeapNode('$', left.freq + right.freq);     top.left = left;   top.right = right;     minHeap.Add(top);   }     // Print Huffman codes using the Huffman tree built above   printCodes(minHeap.Min, '');   }     // Driver Code   static void Main()   {   char[] arr = { 'a', 'b', 'c', 'd', 'e', 'f' };   uint[] freq = { 5, 9, 12, 13, 16, 45 };     int size = arr.Length;     HuffmanCodes(arr, freq, size);   }  }    // This code is contributed by sdeadityasharma>

산출

f: 0 c: 100 d: 101 a: 1100 b: 1101 e: 111>

시간 복잡도: O(nlogn) 여기서 n은 고유 문자 수입니다. n개의 노드가 있는 경우 extractMin()은 2*(n – 1)번 호출됩니다. extractMin()은 minHeapify()를 호출하므로 O(logn) 시간이 걸립니다. 따라서 전체 복잡도는 O(nlogn)입니다.
입력 배열이 정렬된 경우 선형 시간 알고리즘이 존재합니다. 이에 대해서는 곧 다음 게시물에서 논의하겠습니다.

공간 복잡도 :- O(N)

허프만 코딩의 응용:

팩스 및 문자 전송에 사용됩니다.
PKZIP, GZIP 등과 같은 기존 압축 형식에서 사용됩니다.
JPEG, PNG 및 MP3와 같은 멀티미디어 코덱은 Huffman 인코딩(더 정확하게는 접두사 코드)을 사용합니다.

자주 등장하는 일련의 문자가 있는 경우에 유용합니다.

참조:
http://en.wikipedia.org/wiki/Huffman_coding
이 기사는 Aashish Barnwal이 편집하고 techcodeview.com 팀이 검토했습니다.

TechCodeview

허프만 코딩 | 욕심 많은 것-3

연산:

씨

C++

C++

자바

파이썬3

자바스크립트

씨#

허프만 코딩의 응용: