Java Lambda Expressions: Consumer, Supplier and Function

Lambdas are used to create function objects. With them, we can specify methods inside other methods—and even pass methods as arguments to other methods.
A lambda has a shape, one determined by its parameters and return values (if any) and their types. Functional interfaces like Function, Supplier and Consumer accept lambdas with specific shapes.
Example expression. This program creates a Function object from a lambda expression. The lambda expression accepts one argument, an Integer, and returns another Integer.

Left side: On the left of a lambda expression, we have the parameters. A single parameter can be written bare; zero parameters, or two or more, must be surrounded by “(” and “)” chars.

Right side: This is the return expression. It is evaluated using the parameters and, when the lambda’s shape requires a value, its result is returned.

Apply: In this program, we call apply() on the Function object. This evaluates the expression and returns the result: for the argument 10, the result is 20.

Based on: Java 8

Java program that uses lambda expression

import java.util.function.*;

public class Program {
    public static void main(String[] args) {

	// Create a Function from a lambda expression.
	// ... It returns the argument multiplied by two.
	Function<Integer, Integer> func = x -> x * 2;

	// Apply the function to an argument of 10.
	int result = func.apply(10);
	System.out.println(result);
    }
}

Output

20
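
The left-side and right-side rules described above can be sketched with a few more shapes. This is a minimal illustration; the class name LambdaShapes and the constant names are made up for the example.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

class LambdaShapes {

    // One parameter: parentheses are optional.
    static final Function<Integer, Integer> DOUBLE = x -> x * 2;

    // Two parameters: parentheses are required.
    static final BiFunction<Integer, Integer, Integer> ADD = (x, y) -> x + y;

    // A block body needs an explicit return statement.
    static final Function<Integer, Integer> CLAMP = x -> {
        if (x < 0) {
            return 0;
        }
        return x;
    };

    public static void main(String[] args) {
        System.out.println(DOUBLE.apply(10)); // 20
        System.out.println(ADD.apply(3, 4));  // 7
        System.out.println(CLAMP.apply(-5));  // 0
    }
}
```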

Supplier, lambda arguments. A Supplier object receives no arguments. We use an empty argument list to specify a lambda expression with no arguments.

Tip: A Supplier provides values. We call get() on it to retrieve its value. It may return different values when called more than once.

Java program that uses Supplier, lambdas

import java.util.function.*;

public class Program {

    static void display(Supplier<Integer> arg) {
	System.out.println(arg.get());
    }

    public static void main(String[] args) {

	// Pass lambdas to the display method.
	// ... These conform to the Supplier interface.
	// ... Each returns an Integer.
	display(() -> 10);
	display(() -> 100);
	display(() -> (int) (Math.random() * 100));
    }
}

Output

10
100
21
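
To make the point about changing values concrete, here is a small sketch of a Supplier that keeps state between calls. The class SupplierCounter and its counter() method are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SupplierCounter {

    // Returns a Supplier that yields 1, 2, 3, ... on successive get() calls.
    static Supplier<Integer> counter() {
        AtomicInteger next = new AtomicInteger();
        return () -> next.incrementAndGet();
    }

    public static void main(String[] args) {
        Supplier<Integer> ids = counter();
        System.out.println(ids.get()); // 1
        System.out.println(ids.get()); // 2
    }
}
```

Each call to counter() creates an independent sequence, because every Supplier captures its own AtomicInteger.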

Predicate Lambda, ArrayList. The term predicate is used in computer science to mean a boolean-returning method. A Predicate object receives one value and returns true or false.

RemoveIf: This method on ArrayList receives a Predicate. Here, we remove all elements starting with the letter “c.”

Java program that uses removeIf, Predicate lambda

import java.util.ArrayList;

public class Program {
    public static void main(String[] args) {

	// Create ArrayList and add four String elements.
	ArrayList<String> list = new ArrayList<>();
	list.add("cat");
	list.add("dog");
	list.add("cheetah");
	list.add("deer");

	// Remove elements that start with c.
	list.removeIf(element -> element.startsWith("c"));
	System.out.println(list.toString());
    }
}

Output

[dog, deer]
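
A Predicate can also be stored in a variable and called directly with test(). The sketch below assumes nothing beyond java.util.function.Predicate; the class and constant names are illustrative.

```java
import java.util.function.Predicate;

class PredicateDemo {

    // The same test that was passed to removeIf, kept as a reusable object.
    static final Predicate<String> STARTS_WITH_C = s -> s.startsWith("c");

    // negate() is a default method on Predicate that inverts the result.
    static final Predicate<String> NOT_C = STARTS_WITH_C.negate();

    public static void main(String[] args) {
        System.out.println(STARTS_WITH_C.test("cat")); // true
        System.out.println(STARTS_WITH_C.test("dog")); // false
        System.out.println(NOT_C.test("dog"));         // true
    }
}
```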

Consumer. The opposite of a Supplier, a Consumer acts upon a value but returns nothing. Its accept() method is void. We can use a Consumer to call println or other void methods.

Also: A Consumer can be used to mutate data, as in an array, ArrayList or even just a class field.

Java program that uses Consumer

import java.util.function.*;

public class Program {

    static void display(int value) {

	switch (value) {
	case 1:
	    System.out.println("There is 1 value");
	    return;
	default:
	    System.out.println("There are " + Integer.toString(value)
		    + " values");
	    return;
	}
    }

    public static void main(String[] args) {

	// This consumer calls a void method with the value.
	Consumer<Integer> consumer = x -> display(x - 1);

	// Use the consumer with three numbers.
	consumer.accept(1);
	consumer.accept(2);
	consumer.accept(3);
    }
}

Output

There are 0 values
There is 1 value
There are 2 values
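
The note about mutating data can be sketched as follows: a Consumer that appends each value it receives to an ArrayList. The class ConsumerMutation and the method collect() are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ConsumerMutation {

    // Feeds each value to a Consumer that records it in a list.
    static List<Integer> collect(int... values) {
        List<Integer> seen = new ArrayList<>();
        // The lambda captures the list and mutates it; nothing is returned.
        Consumer<Integer> recorder = v -> seen.add(v);
        for (int v : values) {
            recorder.accept(v);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collect(1, 2, 3)); // [1, 2, 3]
    }
}
```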

UnaryOperator. This functional object receives a value of a certain type (like Integer) and returns a same-typed value. So it operates on, and returns, a value.

Java program that uses UnaryOperator

import java.util.function.*;

public class Program {
    public static void main(String[] args) {

	// This returns one value of the same type as its one parameter.
	// ... It means the same as the Function below.
	UnaryOperator<Integer> operator = v -> v * 100;

	// This is a generalized form of UnaryOperator.
	Function<Integer, Integer> function = v -> v * 100;

	System.out.println(operator.apply(5));
	System.out.println(function.apply(6));
    }
}

Output

500
600

UnaryOperator, ArrayList. This example uses a lambda expression as a UnaryOperator argument to the ArrayList’s replaceAll method. It adds ten to all elements.

Note: The forEach method on ArrayList does not change element values. ReplaceAll allows this action.

Java program that uses replaceAll, UnaryOperator

import java.util.ArrayList;

public class Program {
    public static void main(String[] args) {

	// Add ten to each element in the ArrayList.
	ArrayList<Integer> list = new ArrayList<>();
	list.add(5);
	list.add(6);
	list.add(7);
	list.replaceAll(element -> element + 10);
	// ... Display the results.
	System.out.println(list);
    }
}

Output

[15, 16, 17]

BiConsumer, HashMap. A BiConsumer is a functional object that receives two parameters. Here we use a BiConsumer in the forEach method on HashMap.

Note: The forEach lambda here, which is a valid BiConsumer, prints out all keys, values, and the keys’ lengths.

Java that uses BiConsumer, HashMap forEach

import java.util.HashMap;

public class Program {
    public static void main(String[] args) {

	HashMap<String, String> hash = new HashMap<>();
	hash.put("cat", "orange");
	hash.put("dog", "black");
	hash.put("snake", "green");
	// Use lambda expression that matches BiConsumer to display HashMap.
	hash.forEach((string1, string2) -> System.out.println(string1 + "..."
		+ string2 + ", " + string1.length()));
    }
}

Output

cat...orange, 3
snake...green, 5
dog...black, 3

Identifiers. The parameter names in a lambda expression do not matter. They are local to the lambda: they do not affect external parts of the program, and they can be used on both sides of the arrow.

Note: As with variables, there is no need to give the lambda parameter a specific name. Here we use the word “carrot.”

Java that uses unusual lambda identifier

import java.util.function.Consumer;

public class Program {
    public static void main(String[] args) {

	// The identifier in the lambda expression can be anything.
	Consumer<Integer> consumer = carrot -> System.out.println(carrot);
	consumer.accept(1989);
    }
}

Output

1989

Function apply versus method. Here we benchmark a Function object, which we invoke with apply(), against a static method. Both code blocks do the same thing.

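Result: In this run, the Function object’s apply() method was much slower than the static method. Part of the cost likely comes from boxing: Function<Integer, Integer> converts each int to an Integer and back, while the static method works on int directly. A simple timing loop like this one, with no JIT warm-up, gives only a rough indication.

Thus: If a method can be called with no loss of code clarity, this may result in better performance than a lambda or functional object.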


Java that times Function apply, method call

import java.util.function.*;

public class Program {

    static int method(int element) {
	return element + 1;
    }

    public static void main(String[] args) {

	Function<Integer, Integer> function = element -> element + 1;

	long t1 = System.currentTimeMillis();

	// Version 1: apply a function specified as a lambda expression.
	for (int i = 0; i < 10000000; i++) {
	    int result = function.apply(i);
	    if (result == -1) {
		System.out.println(false);
	    }
	}

	long t2 = System.currentTimeMillis();

	// Version 2: call a static method.
	for (int i = 0; i < 10000000; i++) {
	    int result = method(i);
	    if (result == -1) {
		System.out.println(false);
	    }
	}

	long t3 = System.currentTimeMillis();

	// ... Benchmark results.
	System.out.println(t2 - t1);
	System.out.println(t3 - t2);
    }
}

Results

93 ms,    Function apply()
 6 ms,    method call
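
If boxing is a concern in a loop like the one above, one option is a primitive functional interface such as java.util.function.IntUnaryOperator, which works on int without creating Integer objects. The sketch below only shows the two forms side by side; it makes no timing claim, and the class and method names are illustrative.

```java
import java.util.function.Function;
import java.util.function.IntUnaryOperator;

class BoxingFreeLambda {

    // Boxed form: each call converts int to Integer and back.
    static final Function<Integer, Integer> BOXED = element -> element + 1;

    // Primitive form: applyAsInt takes and returns a plain int.
    static final IntUnaryOperator PRIMITIVE = element -> element + 1;

    // Applies the operator to 0..count-1 and sums the results.
    static long sumWith(IntUnaryOperator op, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += op.applyAsInt(i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(BOXED.apply(5));          // 6
        System.out.println(PRIMITIVE.applyAsInt(5)); // 6
        System.out.println(sumWith(PRIMITIVE, 4));   // 10
    }
}
```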

Filter. This method works on streams like IntStream. It returns a modified stream. And we can use methods like findFirst to access elements from filtered streams.
Sum. With a lambda expression, IntStream and the reduce() method we can sum an array. This approach has better parallel potential. But it is slow in simple cases.
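
The filter and sum ideas above can be sketched in a few lines. This is a minimal illustration using IntStream.of, filter, findFirst and reduce; the class StreamFilterSum and its method names are made up for the example.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

class StreamFilterSum {

    // Returns the first element that is at least the given minimum, if any.
    static OptionalInt firstAtLeast(int[] values, int minimum) {
        return IntStream.of(values)
                .filter(v -> v >= minimum)
                .findFirst();
    }

    // Sums the array with reduce(), starting from the identity value 0.
    static int sum(int[] values) {
        return IntStream.of(values).reduce(0, (a, b) -> a + b);
    }

    public static void main(String[] args) {
        int[] values = {5, 20, 35};
        System.out.println(firstAtLeast(values, 10).getAsInt()); // 20
        System.out.println(sum(values));                         // 60
    }
}
```
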
Lambda calculus was introduced in 1936. In this form of mathematics, a function is an object. In a higher-order function, we pass a function object as an argument to a function.

Lambda calculus is a conceptually simple universal model of computation.

Lambda calculus: Wikipedia
At first, functional object names in Java are confusing. What is a Supplier? What is a Consumer? What is supplying what, and who is consuming?
With practice, and some effort, the functional object system in this language is powerful and expressive. It is beautiful. It condenses syntax for complex logical forms.
And with more expressive programs, it becomes possible to improve program quality. Fewer bugs may crawl about. A lambda has strict requirements: the compiler helps check its validity.

 


How Java Debug works

You can attach your IDE to a running application (one that has been started in debug mode, as we’ll see later), or you can debug it from the command line. The application you debug can even be on a different machine.

The magic lies in where the debug information actually resides. People often assume that it is the IDE that knows how to debug your programs, but in truth it is the program itself that knows how to be debugged, and it makes that information available to whoever wants to use it.

The way it works is basically the following. When you compile a program, the .class files get debug information within them, such as line numbers and local variables, which is made accessible to tools that ask for it. You can then run the program in debug mode by passing debug options to the java command (you can run any Java program like this, including mvn goals, application servers, etc.).

Using jdb

There are many ways to start a jdb session. The most frequently used way is to have jdb launch a new Java Virtual Machine (VM) with the main class of the application to be debugged. This is done by substituting the command jdb for java in the command line. For example, if your application’s main class is MyClass, you use the following command to debug it under JDB:

C:\> jdb MyClass 

When started this way, jdb invokes a second Java VM with any specified parameters, loads the specified class, and stops the VM before executing that class’s first instruction.

Another way (used by IDEs)

java -agentlib:jdwp=transport=dt_shmem,address=jdbconn,server=y,suspend=n Sum 3 4

You can then attach jdb to the VM with the following command:

C:\> jdb -attach jdbconn

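The java command above basically says: run this program in debug mode using the JDWP agent, over the shared-memory transport (dt_shmem, available on Windows) at the address jdbconn. The option server=y makes the VM listen for a debugger, and suspend=n lets the program start without waiting for one. With transport=dt_socket,address=4000,suspend=y instead, the VM would listen on a socket at port 4000 and wait for a connection before continuing.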

The jdwp protocol is a communication protocol used by the Java application to receive and issue commands and reply to them.
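
A program can check whether its own JVM was launched with a JDWP agent option by reading the JVM arguments. The sketch below uses the standard java.lang.management API; the class DebugAgentCheck and the method hasJdwpAgent are hypothetical names, and the prefix check is a simplification that only recognizes the -agentlib:jdwp and older -Xrunjdwp forms.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class DebugAgentCheck {

    // Returns true if any JVM argument enables the JDWP agent.
    static boolean hasJdwpAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Debug agent enabled: " + hasJdwpAgent(jvmArgs));
    }
}
```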

For example, you can connect to the application while it is running and issue commands like “print variablex” to know the value of a variable, “stop at x” to set a breakpoint, etc. The application issues notifications like “breakpoint reached”.
The truth is that the protocol is a little more complex than this, but this is enough to illustrate the point.

With that said, we can see that it would even be possible to debug an application using Telnet! (we’ll see later)

Well, enough theory. Let’s see an example. Any simple example will do. We’ll make a simple program that takes two parameters from the command line and prints the sum. The program won’t be well designed (in the sense that it includes some useless temp variables, no validation, etc.) but it will do to illustrate the point.

class Sum{
    public static void main(String[] args){
        int sum1 = Integer.parseInt(args[0]);
        int sum2 = Integer.parseInt(args[1]);
        int suma= sum1+sum2;
        System.out.println("La suma es "+suma);
    }
}

So we compile it: javac -g Sum.java (the -g option adds extra debug info to the class, such as local variable names).
And we run it in debug mode:

C:\> jdb Sum 3 4

You’ll get some output like:
Initializing jdb …
>

That’s it, you have a debug session started. Now the interesting part. Execute the following in your jdb session:

stop at Sum:6

Because the Sum class has not been loaded yet, jdb answers with:

Deferring breakpoint Sum:6.
It will be set after the class is loaded.

You now have a breakpoint on line 6. Type ‘run’ to start the JVM, and the program will run until it reaches that breakpoint.

Now let’s see the value of our variables: run the following commands (one at a time) on the jdb session and see the results.

print sum1
print sum2
print suma
locals
set suma = 10
locals
cont

This is pretty cool stuff. You can debug your program from command line.

 


java String intern

What is String Interning?

String Interning is a method of storing only one copy of each distinct String Value, which must be immutable.

In Java, String class has a public method intern() that returns a canonical representation for the string object. Java’s String class privately maintains a pool of strings, where String literals are automatically interned.

When the intern() method is invoked on a String object, it looks up the string contained by this String object in the pool. If the string is found there, the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

The intern() method lets us compare two String objects with the == operator, because equal interned strings are the same object in the pool; a reference comparison is faster than the equals() method. The pool of strings in Java is maintained for saving space and for faster comparisons. Normally Java programmers are advised to use equals(), not ==, to compare two strings. This is because the == operator compares memory locations, while the equals() method compares the content stored in two objects.

Why and When to Intern?

Java automatically interns String literals and compile-time constants, so we only need to call intern() on strings that are not constants, when we want to be able to quickly compare them to other interned strings. The intern() method should be used on strings constructed with new String(), or otherwise built at run time, before comparing them with the == operator.
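
The difference between ==, equals() and intern() can be seen in a short sketch. The class InternDemo and the helper sameReference are names invented for this illustration.

```java
class InternDemo {

    // True only when both variables refer to the very same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";            // interned automatically
        String built = new String("hello");  // a separate object on the heap

        System.out.println(sameReference(literal, built));          // false
        System.out.println(literal.equals(built));                  // true
        System.out.println(sameReference(literal, built.intern())); // true
    }
}
```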

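A deeper look (translated from Chinese)

The Java language has 8 primitive types and one rather special type, String. To make these types faster at run time and lighter on memory, each is given a constant pool. A constant pool is similar to a cache provided at the Java system level.

The constant pools of the 8 primitive types are all coordinated by the system; the String constant pool is more special. There are two main ways to use it:

  • A String object declared directly with double quotes is stored directly in the constant pool.
  • For a String object that is not declared with double quotes, you can use the intern method provided by String. The intern method checks whether the current string exists in the string constant pool, and if it does not, puts the current string into the pool.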

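II. Differences in intern between JDK 6 and JDK 7

Many Java programmers have probably met the question of how many objects the statement String s = new String("abc") creates. Questions like this mainly test whether the programmer understands the constant pool for string objects. The statement creates 2 objects: the first is the "abc" string stored in the constant pool, and the second is the String object in the Java heap.

Consider the following code: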

public static void main(String[] args) {
    String s = new String("1");
    s.intern();
    String s2 = "1";
    System.out.println(s == s2);

    String s3 = new String("1") + new String("1");
    s3.intern();
    String s4 = "11";
    System.out.println(s3 == s4);
}

The output is:

  • JDK 6: false false
  • JDK 7: false true

The reason is explained shortly. First, move the s3.intern(); statement down one line, after String s4 = "11";, and move s.intern(); after String s2 = "1";. What is the result?

public static void main(String[] args) {
    String s = new String("1");
    String s2 = "1";
    s.intern();
    System.out.println(s == s2);

    String s3 = new String("1") + new String("1");
    String s4 = "11";
    s3.intern();
    System.out.println(s3 == s4);
}

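The output is:

  • JDK 6: false false
  • JDK 7: false false

1. Explanation for JDK 6

[Figure: JDK 6 diagram]

Note: in the figure, green lines show which content a String object points to, and black lines show address references.

As the figure shows, in JDK 6 every comparison above prints false, because the constant pool in JDK 6 lives in the Perm area, and the Perm area is completely separate from the normal Java heap. As said earlier, a string declared with quotes is created directly in the string constant pool, while a String object created with new is placed in the Java heap. So comparing the address of an object in the Java heap with the address of an object in the string constant pool can never give equality, and calling String.intern makes no difference.

2. Explanation for JDK 7

Now for JDK 7. One thing must be made clear here: in JDK 6 and earlier, the string constant pool lives in the Perm area. The Perm area is a static, class-level area that mainly stores information about loaded classes, constant pools, method fragments and so on, and its default size is only 4 MB, so heavy use of intern on the constant pool leads directly to a java.lang.OutOfMemoryError: PermGen space error. In JDK 7, therefore, the string constant pool was moved from the Perm area to the normal Java heap. The small size of the Perm area was a major reason for the move; reportedly JDK 8 removes the Perm area entirely and sets up a new meta area in its place. Presumably the JDK developers decided that the Perm area no longer suits the way Java is evolving.

With the string constant pool moved to the Java heap, we can now explain the output above.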

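[Figure: JDK 7 diagram 1]

  • In the first piece of code, look at s3 and s4 first. String s3 = new String("1") + new String("1"); ends up creating 2 objects: the "1" in the string constant pool and the object in the Java heap that s3 refers to. There are also 2 anonymous new String("1") objects along the way, which we will not discuss. At this point the content of the object s3 refers to is "11", but there is no "11" object in the constant pool yet.
  • Next, s3.intern(); puts the "11" string of s3 into the String constant pool. Because "11" does not exist in the pool at this point, the conventional approach would be the one shown in the JDK 6 figure: create an "11" object in the pool. The key point is that in JDK 7 the constant pool is no longer in the Perm area, and this part was adjusted: the pool no longer needs to store a copy of the object and can store a reference to the heap object directly. That reference points to the object s3 refers to, so the two reference addresses are the same.
  • Finally, in String s4 = "11"; the "11" is declared explicitly, so it is looked up in the constant pool. The pool already holds this entry, which is a reference to the object s3 refers to. So s4 points to the same object as s3, and the final comparison s3 == s4 is true.
  • Now look at s and s2. The first statement, String s = new String("1");, creates 2 objects: the "1" in the constant pool and the string object in the Java heap. s.intern(); makes s look in the constant pool, where it finds that "1" is already there.
  • Next, String s2 = "1"; creates a reference s2 that points to the "1" object in the constant pool. As a result, the reference addresses of s and s2 are clearly different. The figure shows this clearly.

[Figure: JDK 7 diagram 2]

  • Now the second piece of code, following the second figure above. The change from the first piece of code is that s3.intern(); comes after String s4 = "11";. So String s4 = "11"; runs first: when s4 is declared, no "11" object exists in the constant pool, and after the statement runs, the "11" object is a new object created by the declaration of s4. When s3.intern(); runs afterwards, the "11" object already exists in the pool, so the references s3 and s4 are different.
  • For s and s2 in the second piece of code, moving s.intern(); later has no effect, because the "1" object was already created in the pool when the first statement, String s = new String("1");, ran. The declaration of s2 below takes its reference directly from the constant pool, so the reference addresses of s and s2 are not equal.

Recap

The examples above show that JDK 7 made some changes to both the intern operation and the constant pool. There are 2 main points:

  • The String constant pool was moved from the Perm area to the Java heap.
  • When String#intern is called and the object already exists in the heap, a reference to that object is stored directly instead of creating a new object.

III. Using intern

1. An example of using intern correctly

Next, let’s look at a fairly common example of using the String#intern method.

The code is as follows: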

static final int MAX = 1000 * 10000;
static final String[] arr = new String[MAX];

public static void main(String[] args) throws Exception {
    Integer[] DB_DATA = new Integer[10];
    Random random = new Random(10 * 10000);
    for (int i = 0; i < DB_DATA.length; i++) {
        DB_DATA[i] = random.nextInt();
    }
    long t = System.currentTimeMillis();
    for (int i = 0; i < MAX; i++) {
        //arr[i] = new String(String.valueOf(DB_DATA[i % DB_DATA.length]));
         arr[i] = new String(String.valueOf(DB_DATA[i % DB_DATA.length])).intern();
    }

    System.out.println((System.currentTimeMillis() - t) + "ms");
    System.gc();
}

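The JVM arguments are: -Xmx2g -Xms2g -Xmn1500M. The code above is a demonstration in which two statements differ: one uses intern and the other (commented out) does not. The results are as follows:

2160 ms: with intern

826 ms: without intern

From these results we find that the code without intern created 10 million strings, occupying about 640 MB. The code with intern created 1345 strings, occupying about 133 KB in total. The program actually uses only 10 distinct strings, so an exact calculation should give a difference of exactly 1 million times. The example is somewhat extreme, but it does accurately reflect the large space saving that intern brings.

Careful readers will notice that the run time increased somewhat with intern. That is because the program performs new String and then an intern operation on every iteration. When memory is plentiful this cost cannot be avoided, but in practice memory is never unlimited, and the garbage collection time caused by the space consumed without intern is far greater than this extra time. After all, 10 million calls to intern cost only a little more than 1 extra second.

2. Improper use of intern

Having looked at how intern is used and how it works, let’s look at a problem caused by using intern improperly.

While reading from an interface with fastjson, we found that after reading nearly 700,000 records our log output became very slow: each log statement took about 30 ms, and a request that printed more than 2 or 3 log lines took more than twice as long. The problem disappeared after restarting the JVM and reappeared once we continued reading from the interface. Let’s walk through how the problem arose.

1. Finding the cause from the log4j logging

Calls to log4j#info took a very long time, so we used the housemd tool to trace the time spent in the call stack of the info method.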

  • trace SLF4JLogger.
  • trace AbstractLoggerWrapper:
  • trace AsyncLogger
org/apache/logging/log4j/core/async/AsyncLogger.actualAsyncLog(RingBufferLogEvent)                sun.misc.Launcher$AppClassLoader@109aca82            1            1ms    org.apache.logging.log4j.core.async.AsyncLogger@19de86bb  
org/apache/logging/log4j/core/async/AsyncLogger.location(String)                                  sun.misc.Launcher$AppClassLoader@109aca82            1           30ms    org.apache.logging.log4j.core.async.AsyncLogger@19de86bb  
org/apache/logging/log4j/core/async/AsyncLogger.log(Marker, String, Level, Message, Throwable)    sun.misc.Launcher$AppClassLoader@109aca82            1           61ms    org.apache.logging.log4j.core.async.AsyncLogger@19de86bb

The time is spent in the AsyncLogger.location method, which mainly calls return Log4jLogEvent.calcLocation(fqcnOfLogger);

The code of Log4jLogEvent.calcLocation() is as follows:

public static StackTraceElement calcLocation(final String fqcnOfLogger) {  
    if (fqcnOfLogger == null) {  
        return null;  
    }  
    final StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();  
    boolean next = false;  
    for (final StackTraceElement element : stackTrace) {  
        final String className = element.getClassName();  
        if (next) {  
            if (fqcnOfLogger.equals(className)) {  
                continue;  
            }  
            return element;  
        }  
        if (fqcnOfLogger.equals(className)) {  
            next = true;  
        } else if (NOT_AVAIL.equals(className)) {  
            break;  
        }  
    }  
    return null;  
}

Tracing showed that the problem lies in Thread.currentThread().getStackTrace();.

2. Tracing the native code of Thread.currentThread().getStackTrace() to verify String#intern

The Thread.currentThread().getStackTrace() method, which leads into native code:

public StackTraceElement[] getStackTrace() {  
    if (this != Thread.currentThread()) {  
        // check for getStackTrace permission  
        SecurityManager security = System.getSecurityManager();  
        if (security != null) {  
            security.checkPermission(  
                SecurityConstants.GET_STACK_TRACE_PERMISSION);  
        }  
        // optimization so we do not call into the vm for threads that  
        // have not yet started or have terminated  
        if (!isAlive()) {  
            return EMPTY_STACK_TRACE;  
        }
        StackTraceElement[][] stackTraceArray = dumpThreads(new Thread[] {this});
        StackTraceElement[] stackTrace = stackTraceArray[0];  
        // a thread that was alive during the previous isAlive call may have  
        // since terminated, therefore not having a stacktrace.  
        if (stackTrace == null) {  
            stackTrace = EMPTY_STACK_TRACE;  
        }  
        return stackTrace;  
    } else {  
        // Don't need JVM help for current thread  
        return (new Exception()).getStackTrace();  
    }  
}  

private native static StackTraceElement[][] dumpThreads(Thread[] threads);

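We downloaded the OpenJDK 7 source to look up the native implementation in the JDK. The files involved are listed below. [To save space, the code itself is not reproduced here; if you are interested, you can find it from the file names and line numbers.]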

\openjdk7\jdk\src\share\native\java\lang\Thread.c
\openjdk7\hotspot\src\share\vm\prims\jvm.h line:294:
\openjdk7\hotspot\src\share\vm\prims\jvm.cpp line:4382-4414:
\openjdk7\hotspot\src\share\vm\services\threadService.cpp line:235-267:
\openjdk7\hotspot\src\share\vm\services\threadService.cpp line:566-577:
\openjdk7\hotspot\src\share\vm\classfile\javaClasses.cpp line:1635-[1651,1654,1658]:

After tracing through the underlying JVM source, we found that the following three lines of code cause the whole program to slow down.

oop classname = StringTable::intern((char*) str, CHECK_0);  
oop methodname = StringTable::intern(method->name(), CHECK_0);  
oop filename = StringTable::intern(source, CHECK_0);

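These three statements obtain the class name, the method name and the file name. Because class names, method names and file names are all stored in the string constant pool, each lookup goes through String#intern. What was not taken into account is that the default StringPool length is 1009 and cannot change. So once the number of strings in the constant pool reaches a certain scale, performance drops sharply.

3. fastjson’s improper use of String#intern

The reason intern became slow is fastjson’s improper use of the String#intern method. Tracing the implementation in fastjson, we find: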

com.alibaba.fastjson.parser.JSONScanner#scanFieldSymbol()

if (ch == '\"') {
    bp = index;
    this.ch = ch = buf[bp];
    strVal = symbolTable.addSymbol(buf, start, index - start - 1, hash);
    break;
}

com.alibaba.fastjson.parser.SymbolTable#addSymbol():

/**
 * Constructs a new entry from the specified symbol information and next entry reference.
 */
public Entry(char[] ch, int offset, int length, int hash, Entry next){
    characters = new char[length];
    System.arraycopy(ch, offset, characters, 0, length);
    symbol = new String(characters).intern();
    this.next = next;
    this.hashCode = hash;
    this.bytes = null;
}

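fastjson calls intern on every JSON key, caching it in the string constant pool, so that each read is very fast and both time and space are greatly reduced; JSON keys usually do not change. What was overlooked is that when a large number of JSON keys do vary, this puts a heavy load on the string constant pool.

fastjson fixed this bug in version 1.1.24. The program now has a maximum cache size; once it is exceeded, nothing more is put into the string constant pool.

[Code from com.alibaba.fastjson.parser.SymbolTable#addSymbol(), line 113, in version 1.1.24]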

public static final int MAX_SIZE           = 1024;

if (size >= MAX_SIZE) {
    return new String(buffer, offset, len);
}

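This problem was triggered at a data volume of 700,000 records; with several million records it might not stop at 30 ms. So be very careful when using the system-level String#intern!

V. Summary

This article has broadly described the everyday use of String#intern and the string constant pool, how String#intern differs between JDK versions, and the dangers of using it improperly, to give a deeper understanding of the system-level String#intern. This should help us avoid some bugs when we use it and make our systems more robust.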
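
The idea of capping the cache can be sketched in isolation. The class below is not fastjson’s implementation: it is a hypothetical BoundedSymbolCache that keeps canonical strings in its own HashMap (rather than calling String#intern) and stops adding entries once a maximum size is reached. It is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

class BoundedSymbolCache {

    private final int maxSize;
    private final Map<String, String> cache = new HashMap<>();

    BoundedSymbolCache(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns the cached copy of an equal string if there is one.
    // When the cache is full, the value is returned without being stored.
    String canonical(String value) {
        String cached = cache.get(value);
        if (cached != null) {
            return cached;
        }
        if (cache.size() >= maxSize) {
            return value;
        }
        cache.put(value, value);
        return value;
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        BoundedSymbolCache symbols = new BoundedSymbolCache(2);
        String first = symbols.canonical(new String("id"));
        String again = symbols.canonical(new String("id"));
        System.out.println(first == again); // true: the cached copy is reused
        symbols.canonical("name");
        symbols.canonical("email");         // cache is full: not stored
        System.out.println(symbols.size()); // 2
    }
}
```

Frequent keys still share one object, while a stream of ever-changing keys can no longer grow the cache without bound.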


get tomcat hot swap work with intellij

It usually takes a lot of time whenever you’re recompiling a Java web project. The IDE would usually recompile the entire project, package it into a war and redeploy it on your application server (e.g., Tomcat), then let the server reinitialize itself. It takes a lot of time, and it just gets worse as your project grows bigger.

Uh, Hot Swap?

Hot swapping is a more efficient process of doing this. Hot swapping works by replacing the classes that you’ve changed instead of having the entire project recompiled. The resulting classes are then replaced (or rather, hot swapped) in the application server, and everything else just runs normally, without the need for a restart since all the other components didn’t have to be reloaded.
Now, hot swapping isn’t exactly new. Java actually supports it with its own hot swapping solution via the HotSpot VM, but it wasn’t exactly usable enough. Its dealbreaking flaws prevented you from adding new methods or adding new classes. For the most part, that just leaves you with modifying existing method bodies in existing classes. In most cases, that just isn’t enough.
DCEVM fixes that problem. It’s essentially a Java VM modification that allows the redefinition of loaded classes at runtime. This lets you do the things I mentioned above that couldn’t be achieved before.

Getting Hot Swap to Work

This guide demonstrates how to configure Hot Swapping for IntelliJ using DCEVM.

Starting from a fully functional IntelliJ Java web project:

  1. Get the updated, forked version of DCEVM here: http://dcevm.github.io/
    • 64-bit is supported!
    • Works in Linux with matching build numbers
    • While unsupported, it’s currently working on Oracle Java JDK versions as well.
  2. Run the package, and choose Install DCEVM as altjvm
  3. In IntelliJ, under your project’s build configurations (Run -> Edit Configurations), make sure that your project uses exploded war artifacts, instead of the normal war packages.
    • The use of exploded war packages allows IntelliJ to update both classes AND resources whenever they are updated. That means this change will also help you in swiftly reloading JSP pages that have been updated.
  4. In the Server tab, add -XXaltjvm=dcevm to the VM options (seems to me this is optional.)
  5. Change On ‘update’ action and On frame deactivation to Update classes and resources.
    • These changes will ensure that IntelliJ will update whenever changes are made and when you’ve shifted the focus away from IntelliJ. If you don’t want to do this automatically, then set Do nothing for these options.
    • If you have the Live Edit plugin, you can tick with Javascript debugger on and all of your changes on JSP files will also reflect as soon as you’ve finished typing. I wouldn’t recommend this since it’ll trigger a refresh every time, but it’s convenient if you want a WYSIWYG experience.
The end result should look like this.
Once you’re done, simply use Debug mode when executing your web application.
Depending on your settings:
  • IntelliJ will make and compile any classes you’ve changed once you’ve switched your focus away from the IDE. Your changes will reflect as soon as you refresh the page.
  • Unlike with the default hot swap in the HotSpot VM, you’ll be able to create new classes, and new methods in both new and existing classes. Give it a try.
  • Frameworks like Spring and Hibernate are likely not fully supported, due to the nature of how they work or how they are brought up (and how it might be improperly done by hot swapping). It’s workable, but you just have to be careful when to modify or add code that relies on these frameworks.
    • If you REALLY need to, then JRebel has support for it. Unfortunately, JRebel costs money and DCEVM being free is the exact reason why I wrote this post.

 


Watch out for the milliseconds in Java Calendar

Milliseconds problem

If you run the following test multiple times, you get different results:

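import java.util.Calendar;
import java.util.GregorianCalendar;

public class MyTest {

    public static void main(String[] args) {

        // The new calendar starts at the current time, milliseconds included.
        Calendar calendar = new GregorianCalendar();

        // Only these six fields are overwritten; MILLISECOND keeps its value.
        calendar.set(Calendar.YEAR, 2013);
        calendar.set(Calendar.MONTH, 4); // months are zero-based: 4 is May
        calendar.set(Calendar.DATE, 24);
        calendar.set(Calendar.HOUR, 12);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);

        System.out.print("Time " + calendar.getTimeInMillis());
    }
}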

Result: Time 1369454400208 Time 1369454400185 Time 1369454400926

When a Calendar instance is created, it is initialized with the current time. Setting the year, month and so on leaves MILLISECOND untouched, so it keeps the value taken from the current time.

What’s more, the same happens if you get a calendar instance and use the API below:

public final void set(int year, int month, int date, int hourOfDay, int minute, int second)

If you then try to compare the date with a date stored in a database, you will most likely not get any match, because the stray milliseconds make the values differ.

This is exactly what I encountered when passing a date like this to the JPA Criteria API and had no clue why no results came back.

 

some more explanation

Remember that the Calendar’s internal fields include year, month, date, hour, minutes, seconds, milliseconds and time zone. Whenever you are calling a set() method with multiple fields, like set(year, month, date), it will not affect the rest of the fields.

Remember that there is no set() method with multiple fields available to set the milliseconds. If you would like to set the milliseconds, you must use set(Calendar.MILLISECOND, value). Likewise, if you are planning to set all the fields, it’s a good idea to reset all the fields first using the clear() method. This will clear the milliseconds as well.

Most of the times, millisecond field may not be of interest to you. But if you are going to use the UTC milliseconds, by calling getTimeInMillis(), then make sure you set the right values for milliseconds as well.
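
The clear() advice can be sketched as follows. The example pins the calendar to UTC so the result does not depend on the machine’s time zone; that choice, and the names CalendarMillis and fixedInstant, are assumptions made for this illustration.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class CalendarMillis {

    // Builds 2013-05-24 12:00:00.000 UTC with no leftover milliseconds.
    static long fixedInstant() {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        calendar.clear(); // resets every field, including MILLISECOND
        calendar.set(2013, Calendar.MAY, 24, 12, 0, 0);
        return calendar.getTimeInMillis();
    }

    public static void main(String[] args) {
        // Repeated runs print the same value, ending in 000.
        System.out.println("Time " + fixedInstant());
    }
}
```

An alternative with the same effect on the milliseconds is to keep the original code and add calendar.set(Calendar.MILLISECOND, 0).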

java vs c++

At the beginning of a new project, you may be faced with the question, “Should I use C++ (or some other language) for my next project, or should I use Java?” As an implementation language, Java has some advantages and some disadvantages over other languages. One of the most compelling reasons for using Java as a language is that it can enhance developer productivity. The main disadvantage is potentially slower execution speed.

Java is, first and foremost, an object-oriented language. One promise of object-orientation is that it promotes the re-use of code, resulting in better productivity for developers. This may make Java more attractive than a procedural language such as C, but doesn’t add much value to Java over C++. Yet compared to C++, Java has some significant differences that can improve a developer’s productivity. This productivity boost comes mostly from Java’s restrictions on direct memory manipulation.

In Java, there is no way to directly access memory by arbitrarily casting pointers to a different type or by using pointer arithmetic, as there is in C++. Java requires that you strictly obey rules of type when working with objects. If you have a reference (similar to a pointer in C++) to an object of type Mountain, you can only manipulate it as a Mountain. You can’t cast the reference to type Lava and manipulate the memory as if it were a Lava. Neither can you simply add an arbitrary offset to the reference, as pointer arithmetic allows you to do in C++. You can, in Java, cast a reference to a different type, but only if the object really is of the new type. For example, if the Mountain reference actually referred to an instance of class Volcano (a specialized type of Mountain), you could cast the Mountain reference to a Volcano reference. Because Java enforces strict type rules at run-time, you are not able to directly manipulate memory in ways that can accidentally corrupt it. As a result, you can’t ever create certain kinds of bugs in Java programs that regularly harass C++ programmers and hamper their productivity.
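
The casting rules can be sketched with the Mountain, Volcano and Lava types from the paragraph above. The class bodies are empty placeholders, and CastDemo and canTreatAsVolcano are names made up for the example.

```java
class CastDemo {

    static class Mountain {}
    static class Volcano extends Mountain {}
    static class Lava {}

    // Tries the downcast; the run-time check rejects it unless the
    // object really is a Volcano.
    static boolean canTreatAsVolcano(Mountain mountain) {
        try {
            Volcano volcano = (Volcano) mountain;
            return volcano != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canTreatAsVolcano(new Volcano()));  // true
        System.out.println(canTreatAsVolcano(new Mountain())); // false
        // Lava lava = (Lava) new Mountain(); // does not compile: unrelated types
    }
}
```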

Another way Java prevents you from inadvertently corrupting memory is through automatic garbage collection. Java has a new operator, just like C++, that you use to allocate memory on the heap for a new object. But unlike C++, Java has no corresponding delete operator, which C++ programmers use to free the memory for an object that is no longer needed by the program. In Java, you merely stop referencing an object, and at some later time, the garbage collector will reclaim the memory occupied by the object.

The garbage collector relieves Java programmers of the need to explicitly indicate which objects should be freed. As a C++ project grows in size and complexity, it often becomes increasingly difficult for programmers to determine when an object should be freed, or even whether an object has already been freed. This results in memory leaks, in which unused objects are never freed, and memory corruption, in which the same object is accidentally freed multiple times. Both kinds of memory trouble can cause C++ programs to crash, and in ways that make it difficult to track down the exact source of the problem. You can be more productive in Java primarily because you don’t have to chase down memory corruption bugs. You can also be more productive because, once you no longer have to worry about explicitly freeing memory, program design becomes easier.
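A minimal sketch of "stop referencing and let the collector reclaim" follows. It uses a WeakReference only to observe the object without keeping it alive. The class name GcDemo and the field held are assumptions for illustration, and System.gc() is only a hint to the JVM, so the second printed value is not guaranteed.

```java
import java.lang.ref.WeakReference;

class GcDemo {

    // A static field holds a strong reference, so the object stays reachable.
    static Object held;

    // While 'held' refers to the object, the collector must not reclaim it.
    static boolean aliveWhileReferenced() {
        held = new Object();
        WeakReference<Object> weak = new WeakReference<>(held);
        System.gc(); // a request only, the JVM may ignore it
        return weak.get() != null;
    }

    public static void main(String[] args) {
        System.out.println(aliveWhileReferenced()); // true

        WeakReference<Object> weak = new WeakReference<>(held);

        // There is no delete operator: we simply stop referencing the object.
        held = null;
        System.gc();

        // Often prints true after a collection, but timing is up to the JVM.
        System.out.println(weak.get() == null);
    }
}
```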

A third way Java protects the integrity of memory at run-time is array bounds checking. In C++, arrays are really shorthand for pointer arithmetic, which brings with it the potential for memory corruption. C++ allows you to declare an array of ten items, then write to the eleventh item, even though that tramples on memory. In Java, arrays are full-fledged objects, and array bounds are checked each time an array is used. If you create an array of ten items in Java and try to write to the eleventh, Java will throw an exception. Java won’t let you corrupt memory by writing beyond the end of an array.
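The ten-item array case described above can be sketched like this. The writeAt helper is an assumption for illustration. The exception Java throws here is ArrayIndexOutOfBoundsException.

```java
class ArrayBoundsDemo {

    // Hypothetical helper: tries to write to items[index] and
    // reports whether the bounds check allowed it.
    static String writeAt(int[] items, int index) {
        try {
            items[index] = 1;
            return "ok";
        } catch (ArrayIndexOutOfBoundsException e) {
            // The write never happened, so no memory was trampled.
            return "out of bounds";
        }
    }

    public static void main(String[] args) {
        int[] items = new int[10]; // valid indexes are 0 through 9

        System.out.println(writeAt(items, 9));  // ok
        System.out.println(writeAt(items, 10)); // out of bounds
    }
}
```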

One final example of how Java ensures program robustness is by checking object references, each time they are used, to make sure they are not null. In C++, using a null pointer usually results in a program crash. In Java, using a null reference results in a NullPointerException being thrown.
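The null check can be sketched in the same way. The describe helper is an assumption for illustration. Calling a method through a null reference throws NullPointerException, which the program can catch and handle.

```java
class NullCheckDemo {

    // Hypothetical helper: calls a method through the reference and
    // reports what happened.
    static String describe(String s) {
        try {
            return "length " + s.length();
        } catch (NullPointerException e) {
            // The JVM checked the reference before using it.
            return "null reference";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("lava")); // length 4
        System.out.println(describe(null));   // null reference
    }
}
```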


ORA-00911: invalid character

Cause: Identifiers may not start with any ASCII character other than letters and numbers. The characters $, # and _ are also allowed after the first character. Identifiers enclosed by double quotes may contain any character other than a double quote. Alternative quotes (q'#…#') cannot use spaces, tabs, or carriage returns as delimiters. For all other contexts, consult the SQL Language Reference Manual.
Action: None

Reference: http://docs.oracle.com/cd/B28359_01/server.111/b28278/e900.htm#ORA-00910

The ORA-00911 error is very common and is usually caused by a simple syntax mistake. It typically occurs when a programmer makes one of the following mistakes:

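1. When a special character is added to a column name in an SQL statement. Note that #, $ and _ are legal after the first character of an identifier (see the Cause text above), which is why the ename# example below is reported as ORA-00904 (invalid identifier) rather than ORA-00911. Characters outside that set are the ones that raise ORA-00911.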

SQL> select ename# from scott.emp;
select ename# from scott.emp
       *
ERROR at line 1:
ORA-00904: "ENAME#": invalid identifier

2. When a non-printable or special character is introduced by pasting an SQL statement from another editor (usually a backtick ` in place of a single quote ')

SQL> select * from scott.emp where ename like `A%`;
select * from scott.emp where ename like `A%`
                                         *
ERROR at line 1:
ORA-00911: invalid character

3. When a string is not enclosed in single quotes in a WHERE clause condition

SQL> select * from emp where ename like A%;
select * from emp where ename like A%
                                    *
ERROR at line 1:
ORA-00911: invalid character

4. When an extra semicolon (;) is added to the end of the query

SQL> select empno from emp;;
select empno from emp;
                     *
ERROR at line 1:
ORA-00911: invalid character

5. When a semicolon (;) ends the statement text passed to EXECUTE IMMEDIATE in PL/SQL

SQL> begin
  2     execute immediate 'update scott.emp set sal = sal * 1.1 where deptno=10;';
  3     commit;
  4  end;
  5  /
begin
*
ERROR at line 1:
ORA-00911: invalid character
ORA-06512: at line 2

6. When a semicolon (;) is added to the end of a query executed from a programming language such as .NET or Java
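For case 6, one possible guard on the Java side is to remove any trailing semicolons from the SQL text before handing it to the database driver. The sketch below is only an illustration: the class SqlText and the method stripTrailingSemicolons are hypothetical names, not part of JDBC or any Oracle library. It is meant for single SQL statements only. A PL/SQL block must keep its final "end;", so it should not be passed through this helper.

```java
class SqlText {

    // Hypothetical helper: removes trailing semicolons and whitespace
    // from a single SQL statement. Do not use it on PL/SQL blocks,
    // which need their closing "end;".
    static String stripTrailingSemicolons(String sql) {
        String trimmed = sql.trim();
        while (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // The statement from case 5, including the offending semicolon.
        String raw = "update scott.emp set sal = sal * 1.1 where deptno=10;";

        // Prints the statement without the trailing semicolon.
        System.out.println(stripTrailingSemicolons(raw));

        // A statement without a semicolon is returned unchanged.
        System.out.println(stripTrailingSemicolons("select empno from emp"));
    }
}
```

The cleaned string is what would then be passed to the driver, for example to a Statement or PreparedStatement.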

 
