Annotation of cvs_filter/cvs_Lu.sh, Revision 1.3
1.1 hako 1: #! /bin/sh
1.3 ! hako 2: # $Id: cvs_Lu.sh,v 1.2 2020/04/30 13:58:30 hako Exp $
1.1 hako 3: #
4: # cvs_Lu.sh is a filter for CVS.
5: #
6: # Convert the line breaks of *.csv files to LF.
7: # Preserve the UTF-8 BOM.
8: # Delete the UTF-8 BOM from ASCII csv files.
9: #
10: # 20200430
11: #
12: # by Hiroshi Hakoyama
13: #
14: # Background:
1.2 hako 15: # CVS needs LF, but Excel provides CRLF for *.csv files.
16: # Excel needs the BOM to read UTF-8 encoding. Therefore, Excel writes the UTF-8 BOM to a UTF-8 csv file saved with the format "CSV UTF-8 (Comma delimited) (.csv)".
17: # The BOM does not cause trouble for read.csv().
18: # Excel can read a csv file with LF.
1.3 ! hako 19: # Therefore, before committing UTF-8 csv files to the CVS server, we should convert the line breaks to LF and preserve the UTF-8 BOM.
1.1 hako 20: #
21: # Solution:
22: # A CVS filter to change line breaks to LF for *.csv files.
23: #
24: # Usage:
25: # Add an alias in tcsh:
26: # alias cvs 'cvs_Lu.sh'
27:
28: if [ "$1" = "commit" ]; then
29: find . -type f -name '*.csv' -exec nkf -g {} + | grep "UTF-8" | sed -e 's/: UTF-8//g' | tr '\n' '\0' | xargs -0 nkf --oc=UTF-8-BOM -Lu --in-place
30: find . -type f -name '*.csv' -exec nkf -g {} + | grep "ASCII" | sed -e 's/: ASCII//g' | tr '\n' '\0' | xargs -0 nkf --oc=UTF-8 -Lu --in-place
31: find . -type f -name '*.csv' -exec nkf -g {} + | grep "Shift_JIS" | sed -e 's/: Shift_JIS//g' | tr '\n' '\0' | xargs -0 nkf --oc=Shift_JIS -Lu --in-place
32: cvs "$@"
33: else
34: cvs "$@"
35: fi
36:
1.2 hako 37: exit 0
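
The script relies on nkf for both encoding detection and line-break conversion. For readers without nkf, the core idea from the Background notes (turn CRLF into LF while leaving the UTF-8 BOM in place) can be sketched with standard tools only. The function names has_utf8_bom and crlf_to_lf are hypothetical and are not part of cvs_Lu.sh. The sketch also differs from nkf -Lu in one respect: it deletes every CR byte, so a bare CR line break is removed rather than converted to LF, and it performs no encoding detection or conversion.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of cvs_Lu.sh.

# has_utf8_bom FILE
# Succeeds if FILE starts with the UTF-8 BOM bytes EF BB BF.
has_utf8_bom() {
    [ "$(od -An -tx1 -N 3 "$1" | tr -d ' \n')" = "efbbbf" ]
}

# crlf_to_lf FILE
# Deletes every CR byte in FILE. All other bytes, including a
# leading BOM, are left untouched. Writes to a temporary file
# first so FILE is only replaced when tr succeeds.
crlf_to_lf() {
    tmp="$1.tmp.$$"
    if tr -d '\r' < "$1" > "$tmp"; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

A possible use before a commit, assuming the file names contain no newlines, would be to loop over the csv files and call crlf_to_lf on each one, then check has_utf8_bom to confirm that a file exported by Excel still carries its BOM.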