[Bug] kyuubi.frontend.thrift.binary.bind.port address already in use when run flink-yarn-session after start KyuubiServer #5957
Comments
cc @link3280
The engine should not see the configured …
I find the engines use …
Do you mean replacing …? I think the right direction is filtering those configurations out before passing args to the engine launch command.
@pan3793 Understood. But oddly enough, I couldn't find the code related to the filtering. Could you give me some pointers? Another possible approach is filtering config options with …
ping @pan3793
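The filtering direction suggested above could look roughly like the sketch below. Note this is an illustration only: the object name, helper name, and the set of server-only keys are assumptions for the example, not Kyuubi's actual implementation.

```scala
object EngineConfFilter {
  // Hypothetical list of server-only configuration keys/prefixes that
  // must not reach the engine's launch command (illustrative, not
  // exhaustive; kyuubi.frontend.thrift.binary.bind.port is the one
  // causing the "address already in use" failure in this issue).
  private val serverOnlyPrefixes = Seq(
    "kyuubi.frontend.thrift.binary.bind.port",
    "kyuubi.frontend.protocols",
    "kyuubi.metrics."
  )

  // Drop server-side entries before they are rendered as `--conf k=v`
  // arguments of the engine sub-process.
  def filterForEngine(conf: Map[String, String]): Map[String, String] =
    conf.filterNot { case (k, _) =>
      serverOnlyPrefixes.exists(p => k.startsWith(p))
    }
}
```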
I use Kyuubi 1.9.0. I believe @link3280 and @pan3793 will fix the yarn-session bugs eventually; meanwhile, I would like to provide the information to reproduce the bug. This is the root user, which previously connected in yarn-application mode successfully:
0: jdbc:hive2://node4:18009/> select 1+111 as re;
Error: Error operating ExecuteStatement: org.apache.flink.table.api.TableException: Failed to execute sql
at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:976)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1424)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeOperation(OperationExecutor.java:435)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:197)
at org.apache.kyuubi.engine.flink.operation.ExecuteStatement.executeStatement(ExecuteStatement.scala:78)
at org.apache.kyuubi.engine.flink.operation.ExecuteStatement.runInternal(ExecuteStatement.scala:62)
at org.apache.kyuubi.operation.AbstractOperation.run(AbstractOperation.scala:173)
at org.apache.kyuubi.session.AbstractSession.runOperation(AbstractSession.scala:100)
at org.apache.kyuubi.session.AbstractSession.$anonfun$executeStatement$1(AbstractSession.scala:130)
at org.apache.kyuubi.session.AbstractSession.withAcquireRelease(AbstractSession.scala:81)
at org.apache.kyuubi.session.AbstractSession.executeStatement(AbstractSession.scala:127)
at org.apache.kyuubi.service.AbstractBackendService.executeStatement(AbstractBackendService.scala:66)
at org.apache.kyuubi.service.TFrontendService.ExecuteStatement(TFrontendService.scala:253)
at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
at org.apache.kyuubi.shaded.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
at org.apache.kyuubi.shaded.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.kyuubi.shaded.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.kyuubi.service.authentication.TSetIpAddressProcessor.process(TSetIpAddressProcessor.scala:35)
at org.apache.kyuubi.shaded.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.flink.util.FlinkException: Failed to execute job 'collect'.
at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:2221)
at org.apache.flink.table.planner.delegation.DefaultExecutor.executeAsync(DefaultExecutor.java:95)
at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeQueryOperation(TableEnvironmentImpl.java:957)
... 21 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$12(RestClusterClient.java:453)
at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:272)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:943)
at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
... 3 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Not found: /v1/jobs]
at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:590)
at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$4(RestClient.java:570)
at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966)
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
... 4 more (state=,code=0)
A new user connecting produces output like this:
SJCK-BASE80:/opt/bigdata/kyuubi # bin/beeline -u 'jdbc:hive2://node4:18009/;#kyuubi.engine.type=FLINK_SQL;flink.execution.target=yarn-session;yarn.application.id=application_1709174207782_0177' -n roota
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
(the up-history/down-history lines repeat several more times)
Connecting to jdbc:hive2://node4:18009/;#kyuubi.engine.type=FLINK_SQL;flink.execution.target=yarn-session;yarn.application.id=application_1709174207782_0177
2024-03-28 11:19:57.017 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.operation.LaunchEngine: Processing roota's query[937e38e4-4cfc-4633-b2ba-0c5ae31bd784]: PENDING_STATE -> RUNNING_STATE, statement:
LaunchEngine
2024-03-28 11:19:57.019 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: Starting
2024-03-28 11:19:57.019 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Initiating client connection, connectString=10.133.195.122:2181 sessionTimeout=60000 watcher=org.apache.kyuubi.shaded.curator.ConnectionState@62389686
2024-03-28 11:19:57.021 INFO KyuubiSessionManager-exec-pool: Thread-196-SendThread(SJCK-GBASE80:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Opening socket connection to server SJCK-GBASE80/10.133.195.122:2181. Will not attempt to authenticate using SASL (unknown error)
2024-03-28 11:19:57.022 INFO KyuubiSessionManager-exec-pool: Thread-196-SendThread(SJCK-GBASE80:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Socket connection established to SJCK-GBASE80/10.133.195.122:2181, initiating session
2024-03-28 11:19:57.024 INFO KyuubiSessionManager-exec-pool: Thread-196-SendThread(SJCK-GBASE80:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Session establishment complete on server SJCK-GBASE80/10.133.195.122:2181, sessionid = 0x119ce24d0810007, negotiated timeout = 60000
2024-03-28 11:19:57.025 INFO KyuubiSessionManager-exec-pool: Thread-196-EventThread org.apache.kyuubi.shaded.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2024-03-28 11:19:57.045 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.engine.ProcBuilder: Creating roota's working directory at /opt/bigdata/kyuubi/work/roota
2024-03-28 11:19:57.053 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.engine.EngineRef: Launching engine:
/usr/local/jdk-17.0.10/bin/java \
-Xmx1g \
-cp /opt/bigdata/kyuubi/externals/engines/flink/kyuubi-flink-sql-engine_2.12-1.9.0.jar:/opt/bigdata/flink/opt/flink-sql-client-1.17.2.jar:/opt/bigdata/flink/opt/flink-sql-gateway-1.17.2.jar:/opt/bigdata/flink/lib/*:/opt/bigdata/flink/conf:/opt/bigdata/hadoop/etc/hadoop:/opt/bigdata/hadoop/etc/hadoop:/opt/bigdata/hadoop/share/hadoop/common/lib/*:/opt/bigdata/hadoop/share/hadoop/common/*:/opt/bigdata/hadoop/share/hadoop/hdfs:/opt/bigdata/hadoop/share/hadoop/hdfs/lib/*:/opt/bigdata/hadoop/share/hadoop/hdfs/*:/opt/bigdata/hadoop/share/hadoop/mapreduce/lib/*:/opt/bigdata/hadoop/share/hadoop/mapreduce/*:/opt/bigdata/hadoop/share/hadoop/yarn:/opt/bigdata/hadoop/share/hadoop/yarn/lib/*:/opt/bigdata/hadoop/share/hadoop/yarn/*:/opt/bigdata/hadoop/bin/hadoop org.apache.kyuubi.engine.flink.FlinkSQLEngine \
--conf kyuubi.session.user=roota \
--conf flink.app.name=kyuubi_USER_FLINK_SQL_roota_default_4c0efb34-450f-496d-9588-12ab86698b83 \
--conf flink.execution.target=yarn-session \
--conf hive.server2.thrift.resultset.default.fetch.size=1000 \
--conf kyuubi.client.ipAddress=10.133.195.122 \
--conf kyuubi.client.version=1.9.0 \
--conf kyuubi.engine.share.level=USER \
--conf kyuubi.engine.submit.time=1711595997043 \
--conf kyuubi.engine.type=FLINK_SQL \
--conf kyuubi.frontend.protocols=THRIFT_BINARY,REST \
--conf kyuubi.ha.addresses=10.133.195.122:2181 \
--conf kyuubi.ha.engine.ref.id=4c0efb34-450f-496d-9588-12ab86698b83 \
--conf kyuubi.ha.namespace=/kyuubi_1.9.0_USER_FLINK_SQL/roota/default \
--conf kyuubi.ha.zookeeper.auth.type=NONE \
--conf kyuubi.metrics.prometheus.port=18007 \
--conf kyuubi.server.ipAddress=10.133.151.189 \
--conf kyuubi.session.connection.url=10.133.151.189:18009 \
--conf kyuubi.session.real.user=roota \
--conf yarn.application.id=application_1709174207782_0177
2024-03-28 11:19:57.055 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.engine.ProcBuilder: Logging to /opt/bigdata/kyuubi/work/roota/kyuubi-flink-sql-engine.log.0
2024-03-28 11:19:59.051 INFO Curator-Framework-0 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2024-03-28 11:19:59.054 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Session: 0x119ce24d0810007 closed
2024-03-28 11:19:59.054 INFO KyuubiSessionManager-exec-pool: Thread-196-EventThread org.apache.kyuubi.shaded.zookeeper.ClientCnxn: EventThread shut down for session: 0x119ce24d0810007
2024-03-28 11:19:59.055 INFO KyuubiSessionManager-exec-pool: Thread-196 org.apache.kyuubi.operation.LaunchEngine: Processing roota's query[937e38e4-4cfc-4633-b2ba-0c5ae31bd784]: RUNNING_STATE -> ERROR_STATE, time taken: 2.038 seconds
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/flink/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Exception in thread "main" java.lang.NoSuchMethodError: 'void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)'
at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:84)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:318)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:303)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1827)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:709)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:659)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:570)
at org.apache.kyuubi.Utils$.currentUser(Utils.scala:217)
at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<init>(FlinkSQLEngine.scala:66)
at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<clinit>(FlinkSQLEngine.scala)
at org.apache.kyuubi.engine.flink.FlinkSQLEngine.main(FlinkSQLEngine.scala)
Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Exception in thread "main" java.lang.NoSuchMethodError: 'void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism(java.lang.String)'
at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:84)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:318)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:303)
at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1827)
at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:709)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:659)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:570)
at org.apache.kyuubi.Utils$.currentUser(Utils.scala:217)
at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<init>(FlinkSQLEngine.scala:66)
at org.apache.kyuubi.engine.flink.FlinkSQLEngine$.<clinit>(FlinkSQLEngine.scala)
at org.apache.kyuubi.engine.flink.FlinkSQLEngine.main(FlinkSQLEngine.scala)
See more: /opt/bigdata/kyuubi/work/roota/kyuubi-flink-sql-engine.log.0
at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
at org.apache.kyuubi.engine.ProcBuilder.$anonfun$start$1(ProcBuilder.scala:234)
at java.base/java.lang.Thread.run(Thread.java:842)
.
FYI: The last 10 line(s) of log are:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/flink/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
at org.apache.kyuubi.engine.ProcBuilder.getError(ProcBuilder.scala:281)
at org.apache.kyuubi.engine.ProcBuilder.getError$(ProcBuilder.scala:270)
at org.apache.kyuubi.engine.flink.FlinkProcessBuilder.getError(FlinkProcessBuilder.scala:39)
at org.apache.kyuubi.engine.EngineRef.$anonfun$create$1(EngineRef.scala:236)
at org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient.tryWithLock(ZookeeperDiscoveryClient.scala:166)
at org.apache.kyuubi.engine.EngineRef.tryWithLock(EngineRef.scala:178)
at org.apache.kyuubi.engine.EngineRef.create(EngineRef.scala:183)
at org.apache.kyuubi.engine.EngineRef.$anonfun$getOrCreate$1(EngineRef.scala:317)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.kyuubi.engine.EngineRef.getOrCreate(EngineRef.scala:317)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:159)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:133)
at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:133)
at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:133)
at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:842) (state=,code=0)
And my kyuubi-defaults.conf is:
kyuubi-env.sh
# 🔍 Description

This is the root cause of #5957, accidentally introduced in b315123, and thus affects 1.8.0, 1.8.1, 1.8.2, 1.9.0, and 1.9.1. `kyuubi-defaults.conf` is a server-side configuration file; all Kyuubi confs the engine requires should be passed via CLI args to the sub-process.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6455 from pan3793/flink-conf-load.

Closes #5957

2972fbc [Cheng Pan] Flink engine should not load kyuubi-defaults.conf

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit fe5377e)
Signed-off-by: Cheng Pan <chengpan@apache.org>
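Under this fix, the engine-required confs travel only as explicit `--conf key=value` CLI arguments, matching the launch command shown in the server log earlier in this thread. A minimal sketch of rendering a conf map into such arguments (the object and method names are assumptions for illustration, not Kyuubi's actual `ProcBuilder` code):

```scala
// Illustrative sketch: render engine confs as `--conf key=value` pairs,
// mirroring the launch-command format in the server log above.
object ConfArgs {
  def toArgs(conf: Map[String, String]): Seq[String] =
    conf.toSeq.sortBy(_._1).flatMap { case (k, v) =>
      Seq("--conf", s"$k=$v")
    }
}
```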
Code of Conduct
Search before asking
Describe the bug
After starting KyuubiServer, running Flink SQL in yarn-session mode fails because the engine initializes its server reusing kyuubi.frontend.thrift.binary.bind.port; running Flink SQL in yarn-application mode and running Spark SQL both work fine.
When kyuubi.frontend.thrift.binary.bind.port is replaced
with the deprecated kyuubi.frontend.bind.port=10009, all engines are OK.
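The workaround mentioned above amounts to a config change like the following fragment of `kyuubi-defaults.conf` (the port value is the one from this report; adjust for your deployment):

```properties
# Workaround: use the deprecated single bind-port option instead of
# kyuubi.frontend.thrift.binary.bind.port, so the yarn-session Flink
# engine does not reuse the server's Thrift binary port.
kyuubi.frontend.bind.port=10009
```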
Affects Version(s)
master/1.8.0
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
Kyuubi Server Configurations
Kyuubi Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?