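I wasn't going to repost this, but it is genuinely useful. My earlier method is somewhat dated and, more to the point, inconvenient, so here is a more complete version. This problem is very easy to run into: with Ubuntu or Debian on Alibaba Cloud (Aliyun), everything works fine until you run apt-get update, and then the original LC environment is gone.
The original article is at http://perlgeek.de/en/article/set-up-a-clean-utf8-environment . The last time I wrote about this problem was in 2012; that post was titled "perl: warning: Setting locale failed." and is also worth a look.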
How to set up a clean UTF-8 environment in Linux
Many people have problems with handling non-ASCII characters in their programs, or even getting their IRC client or text editor to display them correctly.
To efficiently work with text data, your environment has to be set up properly - it is so much easier to debug a problem which has encoding issues if you can trust your terminal to correctly display correct UTF-8.
I will show you how to set up such a clean environment on Debian Lenny, but most things work independently of the distribution, and parts of it even work on other Unix-flavored operating systems like MacOS X.
Choosing an encoding
In the end, which character encoding you use doesn't matter much, as long as it's a Unicode encoding, i.e. one which can be used to encode all Unicode characters.
UTF-8 is usually a good choice because it efficiently encodes ASCII data too, and the character data I typically deal with still has a high percentage of ASCII chars. It is also used in many places, and thus one can often avoid conversions.
Whatever you do, choose one encoding and stick to it, for your whole system. On Linux that means text files, file names, locales and all text-based applications (mutt, slrn, vim, irssi, ...).
For the rest of this article I assume UTF-8, but it should work very similarly for other character encodings.
Locales: installing
Check that you have the locales package installed. On Debian you can do that with:
$ dpkg -l locales
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Cfg-files/Unpacked/Failed-cfg/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-============================================
ii  locales        2.7-18         GNU C Library: National Language (locale) da
The last line is the important one: if it starts with ii, the package is installed, and everything is fine. If not, install it. As root, type
$ aptitude install locales
If you get a dialog asking for details, read on to the next section.
Locales: generation
Make sure that a UTF-8 locale is generated on your system. As root, type
$ dpkg-reconfigure locales
You'll see a long list of locales, and you can navigate that list with the up/down arrow keys. Pressing the space bar toggles the locale under the cursor. Make sure to select at least one UTF-8 locale; en_US.UTF-8, for example, is usually supported very well. (The first part of the locale name stands for the language, the second for the country or dialect, and the third for the character encoding.)
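As a small illustration of the naming scheme just described, the following sketch splits a locale name into its three parts using only POSIX parameter expansion. The function name split_locale is made up for this example, and it assumes the full language_COUNTRY.ENCODING form.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): split a locale name such as
# en_US.UTF-8 into language, country and character encoding.
# Assumes the name contains both "_" and ".".
split_locale() {
    name=$1
    lang=${name%%_*}        # everything before the first "_"
    rest=${name#*_}         # everything after the first "_"
    country=${rest%%.*}     # what sits between "_" and the first "."
    encoding=${name#*.}     # everything after the first "."
    printf 'language=%s country=%s encoding=%s\n' "$lang" "$country" "$encoding"
}

split_locale en_US.UTF-8
```

Names without a country or encoding part (for example plain C) would need extra handling that this sketch leaves out.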
In the next step you have the option to make one of the previously selected locales the default. Picking a UTF-8 locale as the default is usually a good idea, though it might change how some programs work, and thus shouldn't be done on servers hosting sensitive applications.
If you chose a default locale in the previous step, log out completely and then log in again. In any case you can configure your per-user environment with environment variables.
The following variables can affect programs: LANG, LANGUAGE, LC_CTYPE, LC_NUMERIC, LC_TIME, LC_COLLATE, LC_MONETARY, LC_MESSAGES, LC_PAPER, LC_NAME, LC_ADDRESS, LC_TELEPHONE, LC_MEASUREMENT, LC_IDENTIFICATION.
Most of the time it works to set all of these to the same value. Instead of setting all LC_ variables separately, you can set LC_ALL. If you use bash as your shell, you can put these lines in your ~/.bashrc and ~/.profile files:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
To make these changes active in the current shell, source the .bashrc:
$ source ~/.bashrc
All newly started interactive bash processes will respect these settings.
A Warning about Non-Interactive Processes
There are certain processes that don't get those environment variables, typically because they are started by some sort of daemon in the background.
Those include processes started from cron, at, init scripts, or indirectly spawned from init scripts, like through a web server.
You might need to take additional steps to ensure that those programs get the proper environment variables.
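One possible additional step, sketched here on the assumption that the en_US.UTF-8 locale has been generated, is to set the variables explicitly for the single command being started. The env utility is standard; the function name run_with_utf8 and the job path in the comment are placeholders.

```shell
#!/bin/sh
# Run one command with an explicit locale, regardless of what the calling
# daemon (cron, an init script, ...) put into the environment.
# Assumption: en_US.UTF-8 has been generated on this system.
run_with_utf8() {
    env LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 "$@"
}

# The child process sees the variables even if the parent had none set.
run_with_utf8 sh -c 'echo "LC_ALL=$LC_ALL LANG=$LANG"'

# In a crontab the same idea could look like this (placeholder path):
#   * * * * * env LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 /path/to/job
```

Setting the variables per command keeps the change local, which may be preferable on servers where a system-wide default is not wanted.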
Locales: check
Run the locale program. The output should be similar to this:
LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
If not, you've made a mistake in one of the previous steps, and need to recheck what you did.
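The check can also be scripted. The sketch below reads the output of the locale program and reports whether LC_CTYPE names a UTF-8 locale; the function name ctype_is_utf8 is hypothetical, and the pattern assumes the usual NAME=value or NAME="value" output format.

```shell
#!/bin/sh
# Read `locale` output on stdin and succeed if LC_CTYPE is a UTF-8 locale.
# Hypothetical helper for illustration only.
ctype_is_utf8() {
    # Accept both LC_CTYPE=en_US.UTF-8 and LC_CTYPE="en_US.UTF-8",
    # and both the "UTF-8" and "utf8" spellings.
    grep -Eiq '^LC_CTYPE="?[^"]*\.utf-?8"?$'
}

if locale | ctype_is_utf8; then
    echo "LC_CTYPE is UTF-8"
else
    echo "LC_CTYPE is not UTF-8"
fi
```

Only LC_CTYPE is inspected here because it is the category that governs character handling; the other categories could be checked the same way.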
Setting up the terminal emulator
How to set up your terminal emulator for UTF-8 strongly depends on what you actually use. If you use xterm, you can start it as xterm -en utf-8; konsole and the Gnome Terminal can be configured in their respective configuration menus.
Testing the terminal emulator
To test if your terminal emulator works, copy and paste this line into your shell:
perl -Mcharnames=:full -CS -wle 'print "\N{EURO SIGN}"'
This should print a Euro sign € on the console. If it prints a single question mark instead, your fonts might not contain it. Try installing additional fonts. If multiple different (nonsensical) characters are shown, the wrong character encoding is configured. Keep trying :-).
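If perl is not at hand, the same test can be approximated with printf, since the Euro sign (U+20AC) is encoded in UTF-8 as the three bytes E2 82 AC. This is a sketch and not part of the original article; octal escapes are used because they are portable across POSIX shells, and the function name print_euro is made up.

```shell
#!/bin/sh
# Emit the UTF-8 encoding of U+20AC EURO SIGN
# (bytes E2 82 AC, written here as octal 342 202 254).
print_euro() {
    printf '\342\202\254\n'
}

print_euro                   # should display the Euro sign on a UTF-8 terminal
print_euro | od -An -tx1     # shows the raw bytes that were actually sent
```

Comparing the od output with what the terminal displays helps separate a font problem (right bytes, wrong glyph) from an encoding problem.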
SSH
If you use SSH to log in to another machine, repeat the previous steps, making sure that the locale is set correctly, and that you can view a non-ASCII character like the Euro sign.
Screen
The screen program can work with UTF-8 if you tell it to.
The easiest (and sometimes the only) way is to start it with the -U option:
$ screen -U
and also when reattaching (screen -Urd or so).
Inside a running screen you can try Ctrl+a :utf8 on<return>. If that doesn't work, exit your screen and start a new one with -U.
Irssi
There's a complete guide for setting up irssi to use UTF-8, which partially overlaps with this one. The gist is:
/set term_charset utf-8
/set recode_autodetect_utf8 ON
/set recode_fallback ISO-8859-15
/set recode ON
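Gogs listens on port 3000 by default. If you already run another web service such as Apache, the two obviously cannot both occupy port 80, and this is where Apache's proxy feature comes in. If you installed Apache with apt-get, this is easy: Debian and Ubuntu both ship small helper scripts such as a2ensite and a2enmod. Just run a2enmod proxy proxy_http, then create a new site with the following configuration: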
<VirtualHost *:80>
    ServerAdmin test@test.com
    ServerName yoursitename
    ProxyRequests off
    ProxyPass / http://127.0.0.1:3000/
    ProxyPassReverse / http://127.0.0.1:3000/
    <Proxy *>
        Order Deny,Allow
        Allow from all
    </Proxy>
    ErrorLog ${APACHE_LOG_DIR}/yoursitename.log
    LogLevel warn
</VirtualHost>
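Simple, isn't it? Once you start gogs under supervisor, it is reachable over the web. There is one small drawback, though: if you killall the gogs process, you will find that it won't start again. The startup log tells you that port 3000 is already in use. Running netstat -an|grep 3000 shows N connections stuck in CLOSE_WAIT, and echo $(netstat -anp|grep 127.0.0.1:3000 |awk '{print $7}') reveals that apache2 is holding all of them.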
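The two netstat checks can be combined into one summary. The sketch below counts, per owning program, the CLOSE_WAIT connections that involve 127.0.0.1:3000. The function name close_wait_owners is made up, and the awk script assumes the usual Linux netstat -anp column layout (state in column 6, PID/program in column 7).

```shell
#!/bin/sh
# Summarise, per owning program, the CLOSE_WAIT connections that involve
# 127.0.0.1:3000. Reads `netstat -anp` style lines on stdin.
# Assumed layout: local address in column 4, foreign address in column 5,
# state in column 6, PID/program in column 7.
close_wait_owners() {
    awk '$6 == "CLOSE_WAIT" && ($4 == "127.0.0.1:3000" || $5 == "127.0.0.1:3000") {
             count[$7]++
         }
         END { for (p in count) print count[p], p }'
}

# Typical use on the affected machine (root is needed to see program names):
#   netstat -anp | close_wait_owners
```

Each output line is a connection count followed by the PID/program entry, which makes it easy to see at a glance which process is holding the port.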
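My guess is that when I force-killed gogs, Apache, with the proxy enabled, had just sent requests to port 3000 that were left hanging, so the connections were never closed properly and stayed stuck. (I had a similar problem when using nginx as a reverse proxy: whenever the client machine went down, nginx had to be restarted.) So I restarted Apache, and gogs immediately started successfully.
So, to restart gogs, it is best to use /etc/init.d/supervisor restart; if that doesn't work properly, restart Apache as well...
OK, all problems solved.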